RSVP on Meetup is turned off, so make sure to register at the link.
We are partnering with Google to host a series of tech events for learning and practicing data analytics and data processing with Apache Beam. This is the 1st session (the 2nd session is on 13 May, 10 AM PT; register on the website).
Start date: 6 May, 10 AM PT (US Pacific Time, GMT-7). Please double-check your local time.
Apache Beam is an open source, unified model for defining both batch and streaming data-parallel processing pipelines. Using one of the open source Beam SDKs, you build a program that defines the pipeline. The pipeline is then executed by one of Beam’s supported distributed processing back-ends, which include Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow.
In this talk, we will introduce Apache Beam using Jupyter Notebooks by live coding both a batch and a streaming pipeline on publicly available COVID-19 data.
Samuel Rohde is a Software Engineer at Google and has been working on the Cloud Dataflow team for the past 5 years. He graduated from UIUC. Sam has been contributing to the Apache Beam source code for the past couple of years.
Ning Kang is a member of the Google Cloud Dataflow team and has been contributing to the Apache Beam Interactive Notebook OSS project. Before that, he was a software engineer on the Google Store team, where he helped with 3 large hardware sales events (Pixel phones, etc.). Before joining Google, he worked in the EMR software industry.