Wiggly Air and Data Processing at Spotify


Details
NOTE: 4 WTC building security requires photo ID for visitors. Please RSVP with your real name, matching your government ID, to ensure a smooth check-in process. We will close RSVPs on Thu, Feb 20 at 10 AM.
Directions: check in at the Spotify concierge on the left of the lobby, ask for Beam Meetup, use elevator bank E and head to 71F.
Food and drinks sponsored by Spotify
========================================
Talk: Scio - Data Processing at Spotify - Claire McGinty
Bio:
Claire is a data infrastructure engineer at Spotify, working primarily on Scio and other data processing tooling. Prior to Spotify she worked on newsfeed recommendations at LinkedIn.
Abstract:
Scio is an open-source Scala API for Apache Beam and Google Cloud Dataflow. It was created by Spotify to process petabytes of data in both batch and streaming mode, and has been adopted by a dozen other companies as well. We’ll talk about some features that make it stand out from other Scala big-data frameworks, including our use of Algebird, macros, shapeless, magnolia, etc. to make large-scale data processing easier, safer, and faster.
========================================
Talk: Music Is Just Wiggly Air: Streaming Signal Processing in Python at Scale - Lynn Root
Bio:
Lynn Root is an SRE at Spotify with historical issues of using her last name as her username, and the resident FOSS evangelist. She is also a global leader of PyLadies and former Vice Chair of the Python Software Foundation Board of Directors. When her hands are not on a keyboard, they are usually holding a bass guitar or a paint brush.
Abstract:
Digital signal processing (DSP) has been made easy with the help of many Python libraries, allowing engineers and researchers to quickly and effortlessly analyze audio, images, and video. However, scaling these algorithms and models to process millions of files has not been equally seamless. At Spotify, we’re trying to address scaling DSP over our catalog of over 50 million songs. This talk will discuss the challenges we’ve encountered while building the infrastructure needed to support signal processing at scale. I’ll discuss how we’ve leveraged Apache Beam for streaming data pipelines and the tooling we’ve built on top of Beam to support our heavy resource requirements.
