DataTalks #6: DataHack Champions


DataTalks @ Taboola (https://www.taboola.com/)
DataTalks (http://datahack-il.com/) #6: DataHack (http://datahack-il.com/) Champions.
Our sixth meetup will be hosted by Taboola (https://www.taboola.com/) and will feature cool past projects from DataHack (http://datahack-il.com/).
Language: Hebrew
Location: Atrium Tower, 2 Jabotinsky St., 32nd fl., Ramat Gan
Schedule:
• 18:00 - 18:15 - Gathering, snacks & mingling
• 18:15 - 18:20 - Opening words
• 18:20 - 19:00 - First talk:
Sraya Louis, The Hebrew University of Jerusalem - Using graphs to predict ship type according to ship behavior
• 19:00 - 19:10 - A short break
• 19:10 - 19:50 - Second talk:
Seffi Cohen - Ensemble models approach for cab ride duration prediction
• 20:00 - 20:40 - Third talk:
Doron Kukliansky, Google - Data Driven Video Creation
==== Talk #1 ===
Speaker: Sraya Louis, The Hebrew University of Jerusalem
Title: Using graphs to predict ship type according to ship behavior
Abstract: Given records of ship behavior, such as port visits and ship-to-ship meetings, we try to classify ships by type: oil, container, fishing, etc. In this talk we will discuss how engineering new features based on the graph that a ship spans can capture a ship's behavior and thus improve classification accuracy. We will present the problem, the mathematical tools and some intuition, and, for fun, we will conclude with failure points (and possible solutions).
==== Talk #2 ===
Speaker: Seffi Cohen
Title: Ensemble models approach for cab ride duration prediction
Abstract: In this talk I'll share how we attempted to predict cab ride duration using various generated features and models, and how we settled on a model ensemble approach to exploit the advantages of different models. I will talk about ensemble methods, how to choose a model that gives good results in a short amount of time, and how to engineer and choose good features. I will also share lessons learned from multiple Kaggle competitions and from being on one of the winning teams in DataHack two years in a row.
==== Talk #3 ===
Speaker: Doron Kukliansky
Title: Data Driven Video Creation
Abstract: In this talk we will discuss our DataHack project, in which we attempted to generate new episodes of The Simpsons using data science tools. We will see the general approach, the data we had and, more importantly, the data we did not have and how we compensated for it. We will also dive deep into two technical problems that we encountered during the project and that are of general interest:
- The first is speaker recognition, for which we'll discuss the MFCC features and how they can be used for classification.
- The second is semantic sentence similarity, for which we'll discuss the Word Mover's Distance, its origin and usage.
Prior familiarity with The Simpsons isn't necessary but is an advantage.