Kaggle Meetup - July 2018
Details
We are very excited to have BurdaForward GmbH hosting our July Meetup! A very big thank-you from Kaggle Munich.
BurdaForward GmbH
St. Martin-Straße 66
81541 München
Please ask for "Sunyard". We will be there.
BurdaForward publishes and markets a large selection of very popular digital journalistic services, including FOCUS Online, Chip, HuffingtonPost, Finanzen100, The Weather Channel and NetMoms, reaching almost 33 million people in Germany – 50% of German internet users. Most BurdaForward portals are considered leaders in their field.
The idea of the “Pursuit of happiness” has been lived at the parent company, Hubert Burda Media, for more than 100 years. BurdaForward, as a young tech-media company, works hand in hand with its numerous journalists, developers, editors, communication experts and digital analysts to focus every day on the needs of its users and enhance the life of every individual.
AGENDA
We are going to have two hands-on talks:
- Kunal Gautham: Spark for Data Science: Theory and Practice
Kunal works as a Big Data Consultant at Hortonworks and has extensive experience with the open-source data pipeline stack. He will spend about 20-25 minutes introducing the theory behind Spark, followed by about 30 minutes on a real example of using Spark for machine learning.
- Alex Tselikov: Hands-on solving of classification and regression competitions on Kaggle: validation, feature engineering, ensembles
Examples from current competitions: “Home Credit Default Risk”, “Santander Value Prediction Challenge”.
18:00 - 19:00 Registration check, food, drinks and networking
19:00 - 19:15 Welcoming - Burda team
19:15 - 20:00 Speaker 1: Kunal Gautham
20:00 - 20:30 Speaker 2: Alex Tselikov
20:30 - 22:00 Hands-on Kaggle hacking
After that, we will split into small groups as usual to hack on Kaggle problems.
I (Maher) have been working on the "Home Credit Default Risk" competition. So far I have about 35 submissions and a score of 0.772, which places me in the top 57% of competitors. I expect to improve my score by the time of the event by adding more features. I will show my work, and if you are interested in taking it further, we can do it together.
We can always communicate using our new Slack channel.
FEATURED COMPETITION
The idea is to have one focused topic where we do not have to start from scratch and can jump into the nitty-gritty of the competition. Besides the featured competition, we will of course still have groups working on their personal favorites!
AUDIENCE
No matter whether you are new to data science or a Kaggle grandmaster, every data aficionado is warmly welcome to join the party and have fun hacking on Kaggle problems!
EQUIPMENT
Please bring curiosity, enthusiasm and your laptop with a running Python or R environment. See here (https://www.kaggle.com/c/titanic/details/getting-started-with-python) for help setting up the environment. For a quick and painless start, you can also rely on this Docker image: https://hub.docker.com/r/tensorflow/tensorflow/
