Building Your Data Project from Scratch: Best Practice and Use Cases


Details
After many successful meetups in NYC, London and Paris, Dataiku is finally bringing its meetup series to Germany! We invite you to join us on October 16th to learn more about data science in two one-of-a-kind talks!
First Talk: "Transfer Learning for Cold Start Problem"
The cold start problem is one of the biggest challenges when applying machine learning in industry. The goal is to build an accurate, production-ready machine learning model when there is either no data or not enough data available to train it: the so-called cold start. In this case, methods like transfer learning prove valuable because they reuse a model that was trained on a different but related task.
In her talk, Jekaterina will guide you through a real-world cold start problem she worked on in payment fraud detection at Zalando. The business case aims at building a payment fraud detection model for a new market where no historical data is available. The challenge is how to combine the data from existing markets so that the resulting model generalises to the new market. Jekaterina will discuss several possible solutions, such as transfer learning, domain adaptation and domain generalisation. Additionally, she will explore how some of these techniques can be extended to a stream of unlabeled or sparsely labeled data.
Speaker Bio: Jekaterina Kokatjuhha
Jekaterina is a Research Engineer at Zalando, focusing on scalable machine learning for fraud prediction. Jekaterina obtained a master's degree in bioinformatics from FU Berlin and has worked in various research institutions across Europe, such as the Charité Hospital in Berlin, the Centre for Genomic Regulation in Barcelona and Manchester University.
_______________________________________________________
Second Talk: "How to Build a Basic Website based on Real-Time Prediction"
Dataiku recently launched a side project called Human or Company: a web page where anyone can enter a Twitter username and instantly determine whether that username belongs to a person or a company.
In this talk, Product Manager Jeremy Greze explains how the model behind the algorithm was created, the features that influence the classification, and how the site was built to respond to real-time requests. Although this is a low-key and modest side project, it is a great example of building a real-time prediction service with advanced analytics, and it shows that trendy terms like machine learning or artificial intelligence can be used in simple and small (but effective!) ways.
Speaker Bio: Jeremy Greze
Product Manager at Dataiku
Please don't forget to RSVP!

