Talk: 7 - 7:45 PM
Q&A: 7:45 - 8 PM
Networking: 8 - 8:30 PM
Title: Stream processing with Apache Beam
Abstract: In this talk, we present the new Python SDK for Apache Beam, a parallel programming model for implementing batch and streaming data processing jobs that can run on a variety of execution engines, such as Apache Spark and Google Cloud Dataflow. We will use examples to discuss some of the interesting challenges in providing a Pythonic API and execution environment for distributed processing.
Speaker Bio: Sourabh Bajaj is a software engineer at Google interested in data infrastructure and machine learning. He currently works on TensorFlow and Apache Beam. Prior to Google, he was part of the Data Science team at Coursera, working on everything from recommendation systems to data warehousing.