Introducing Scio, a Scala API for Google Cloud Dataflow


Details
(Note: Special thanks to our friends at NYC Data Engineering (https://www.meetup.com/NYC-Data-Engineering/) for partnering with us on this event!)
Talk 1 - Scio, a new Scala API for Google Cloud Dataflow
Neville Li, Spotify
https://github.com/spotify/scio
About the talk:
Learn about Scio, a Scala API for Google Cloud Dataflow (incubated as Apache Beam). Apache Beam offers a simple, unified programming model for both batch and streaming data processing, while Scio brings it much closer to the high-level APIs many data engineers are familiar with, e.g. Spark and Scalding. Neville will cover the design and implementation of the framework, including features like type-safe BigQuery macros, the REPL, and serialization. There will also be a live coding demo.
Bio:
Neville is a software engineer at Spotify who works mainly on data infrastructure and tools for machine learning and advanced analytics. In the past few years he has been driving the adoption of Scala and new data tools for music recommendation, including Scalding, Spark, Storm and Parquet. Before that he worked on search quality at Yahoo! and on old-school distributed systems like MPI.
Talk 2 - Google engineer on Google Cloud Platform (TBD)
Schedule
6:30pm - Doors open & food/drinks
7:00pm - Talks
8:00pm - Q/A
8:15pm - Additional socializing w/ speakers & other awesome data engineers
