17:30-18:30 - Pizza & Drinks
18:30-19:30 - Data Science at the Command Line
Jeroen Janssens, Senior Data Scientist @ YPlan
The *nix command line, although invented decades ago, is an amazing environment for doing data science. By combining small yet powerful command-line tools, we can explore our data and quickly hack together prototypes. The recent addition of tools such as GNU Parallel, jq, and Drake further enables us to be more productive and efficient data scientists. Unfortunately, installing these command-line tools and setting up an efficient environment is not straightforward.
In the first part of this talk I will present a new open-source project called the Data Science Toolbox, which is a virtual environment that allows you to get started doing data science in minutes. It comes with commonly used software for data science and allows for easy installation of additional tools. Because the Data Science Toolbox runs on top of VirtualBox, it can be installed not only on Linux, but also on Mac OS X and Microsoft Windows.
Once you have a solid environment, it is worthwhile to further customize it to your own needs. In the second part of the talk I will explain how to (1) make your environment more efficient and (2) create reusable command-line tools from one-off commands or from existing code in, for example, Python and R.
By the end of this talk you will have a solid understanding of how to leverage the power of the command line for your next data science project.
Jeroen Janssens is a senior data scientist at YPlan, tonight's going-out app, where he's responsible for making event recommendations more personal. Jeroen holds a Ph.D. in Machine Learning from Tilburg University. He is authoring a book called "Data Science at the Command Line", which will be published by O'Reilly in summer 2014. Jeroen enjoys biking the Brooklyn Bridge, building tools, and blogging at http://jeroenjanssens.com. He can be found on Twitter @jeroenhjanssens.
19:30-19:45 - Break
19:45-20:45 - TomTom’s use of Traffic Data
Ralf-Peter Schäfer, Fellow & VP Traffic and Travel Information Product Unit @ TomTom
Every day, TomTom's navigation devices generate about 10 billion data points about the speed and position of cars. TomTom uses this data to provide real-time traffic information, calculate accurate travel times, and much more. Mr. Schäfer will discuss how the data collection process works, how to manage such a large volume of data, and how TomTom creates value from it through sophisticated analysis.
20:45-21:30 - Drinks