Let's talk Kaggle


Dear Kagglers,
Now that we are back from summer vacation, we will start with a workshop and a talk on Friday and then hack on Kaggle all day Saturday! :-)
Tentative Schedule
• 18:30 Doors open, get a drink and meet cool data people in Zalando's great sky lounge
• 19:00 Starting Data Science with Kaggle - Gerrit Gruben
• 20:45 Lessons from the Home Depot challenge - Andreas Merentitis
• 21:30 Doors close; time to get to bed before Saturday's hack day!
Starting Data Science with Kaggle - Gerrit Gruben
To start off, I will try to show you the advantages of doing Kaggle (compared to other activities) and how it may help you overcome the dichotomy that comes with being a Data Scientist.
Once you are motivated, I will show you a setup for your Kaggle projects that makes your Python data science projects isolated, reproducible and well structured, so that you can work with others without much fuss.
I will bring together a bunch of tools and technologies (most of you will probably know some of them) to help you direct your learning in a way that is as "industry ready" as possible.
To showcase these, we will start Kaggling and build an example submission, using the tools concretely and discussing some important topics from ML and statistics, such as cross-validation.
This workshop is mainly intended for beginners, but thanks to its fast pace it should be interesting for intermediate and advanced participants as well. The idea is that you, as a beginner, try it out yourself at the hack day on Saturday.
Lessons from the Home Depot challenge: predicting product search relevance - Andreas Merentitis
Abstract: The first part of the talk will briefly present Kaggle, focusing on the reasons that drive companies to organize contests and data scientists to compete in them. The general framework of a typical contest will also be presented, and its implications for the solutions and business answers derived will be discussed. The main part of the talk will focus on the specifics of the Home Depot Product Search Relevance contest, presenting the features and models that were used in the winning solution. Finally, lessons learned will be discussed and an attempt will be made to abstract towards a general Kaggle recipe.
Andreas is a Sr. Data Scientist at Zalando SE in Berlin and, together with his colleagues Alex and Nurlan, won the Home Depot challenge on Kaggle (which some people from our group tackled). He brings decades of experience in computer science and earned his PhD at the University of Athens, researching wireless sensor networks.
=====================================================
As always, everybody is welcome. If you need anything, let me know by email.
Looking forward to meeting all of you.
