Batch Data Processing at Spotify

Details
NYC Data Engineering hosts events where the lead engineers from pre-screened companies showcase their technical challenges.
ONLY ENGINEERS can attend. If you are not an engineer, please be aware that you may be removed from the list.
Speaker: Erik Bernhardsson, Tech Lead, Spotify
Erik is the Tech Lead of the discovery team at Spotify. He is building a music recommender system using large-scale machine learning algorithms, mainly matrix factorization of big matrices using Hadoop.
He was previously head of the Business Intelligence team in the Stockholm office, where he was responsible for collecting, aggregating, and making sense of all the data.
About the talk:
Erik will be talking about Luigi, a recently open-sourced Python framework that helps you build complex pipelines of batch jobs, handle dependency resolution, and create visualizations to help manage multiple workflows.
It also comes with Hadoop support built in (and that’s really where its strength becomes clear). Luigi provides an infrastructure that powers several Spotify features including recommendations, top lists, A/B test analysis, external reports, internal dashboards, and many more.
Read more about it on GitHub: https://github.com/spotify/luigi
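To give a feel for what "dependency resolution" between batch jobs means, here is a minimal sketch in plain Python. It is not Luigi's API: the Task base class, the three example tasks, the build helper, and the in-memory store are all hypothetical names made up for illustration, loosely following the idea that each task declares what it requires and how it runs. The sample play data is invented.

```python
# Illustrative sketch only: these classes are hypothetical and are NOT Luigi's API.

class Task:
    """A unit of batch work that may depend on other tasks."""

    def requires(self):
        return []  # upstream tasks; none by default

    def run(self, store):
        raise NotImplementedError


class RawPlays(Task):
    def run(self, store):
        # Stand-in for loading a day's play logs (made-up data).
        store["plays"] = ["song_a", "song_b", "song_a"]


class PlayCounts(Task):
    def requires(self):
        return [RawPlays()]

    def run(self, store):
        counts = {}
        for song in store["plays"]:
            counts[song] = counts.get(song, 0) + 1
        store["counts"] = counts


class TopList(Task):
    def requires(self):
        return [PlayCounts()]

    def run(self, store):
        # Rank by play count (descending), then by name for a stable order.
        ranked = sorted(store["counts"].items(), key=lambda kv: (-kv[1], kv[0]))
        store["top"] = [song for song, _ in ranked]


def build(task, store, done=None, order=None):
    """Run `task` after its dependencies, each task type at most once."""
    done = set() if done is None else done
    order = [] if order is None else order
    name = type(task).__name__
    if name in done:
        return order
    for dep in task.requires():
        build(dep, store, done, order)
    task.run(store)
    done.add(name)
    order.append(name)
    return order


store = {}
order = build(TopList(), store)
print(order)         # execution order chosen by the resolver
print(store["top"])  # final top list
```

Asking for TopList is enough: the helper walks the requires() chain and runs the upstream jobs first. A real framework would also persist outputs, skip work that is already complete, and schedule jobs across machines, none of which this sketch attempts.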
Agenda
7:00 - 7:30 Doors open, pizzas and beer
7:30 - 8:30 Presentation by Erik
8:30 - 9:00 Wrap up, chat with other data engineers

