Building data pipelines using Luigi, with Erik Bernhardsson of Spotify

Details

For our next meetup we are pleased to welcome Erik Bernhardsson (http://erikbern.com/), the Engineering Manager for Music Discovery & Machine Learning at Spotify.

Erik will be talking about Luigi (https://github.com/spotify/luigi), a powerful and versatile open-source framework for building data pipelines, of which he is one of the principal authors. Luigi is a Python module for automating complex, multi-step workflows. It provides dependency management, step-by-step workflow execution, and visualization, and is designed to be scalable and fault-tolerant. Luigi is used at Spotify (of course) and at lots of other companies. For two great examples, check out how Asana (https://eng.asana.com/2014/11/stable-accessible-data-infrastructure-startup/) and Buffer (https://overflow.bufferapp.com/2014/10/31/buffers-new-data-architecture/) use it to orchestrate complex analytics pipelines.

Erik's bio: "Swedish computer nerd living in NYC. I graduated with a master's degree in Physics from KTH in Stockholm, but I've been writing code for 20+ years. My work has ranged from embedded systems to high frequency trading to machine learning. Since joining Spotify in 2009, I've designed and built many large-scale machine learning algorithms we use to power the recommendation features. I've led the team that built and released features like the radio feature, the 'Discover' page, 'Related Artists', and much more."

Pizza, beer, and soft drinks will be provided, courtesy of our co-sponsors Mortar (https://www.mortardata.com/) and Spotify (https://www.spotify.com/us/).