Apache Beam meetup 1 at Criteo
We are excited to invite you to join us for the first Beam meetup in Paris.
We hope to welcome you at the Criteo (https://www.criteo.com) offices this time!
18:00 - Registrations and welcome.
18:45 - Welcome and introduction to Apache Beam (Matthias Baetens).
19:00 - 1st talk: Beam IOs (Jean-Baptiste Onofré)
19:30 - 2nd talk: Python Streaming Pipelines with Beam on Flink (Konstantin Knauf)
20:00 - 3rd talk: Building an interactive streaming SQL engine in Beam (Pablo Estrada)
20:45 - Pizza, drinks and networking.
For the first talk, we welcome Jean-Baptiste Onofré (https://www.linkedin.com/in/jean-baptiste-onofr%C3%A9-a0739317/), fellow and software architect at Talend. JB is an ASF member, a contributor to ~20 Apache projects, a speaker at Strata, and one of the initial committers on the Beam project (as well as a PMC member of the project). He will talk about the latest developments in the IO field in Beam.
For the second talk, we welcome Konstantin Knauf. As a Solution Architect at Ververica, Konstantin helps the company's clients solve their business problems with Apache Flink and Ververica Platform. In this role, he is also one of the first people customers turn to if a streaming application is not performing as expected. Before joining Ververica, he worked as a Senior Consultant with TNG Technology Consulting, where he supported clients mainly in the areas of Distributed Systems and Automation. Konstantin studied Mathematics and Computer Science at TU Darmstadt, specializing in Stochastics and Algorithmics.
In the third talk, Pablo Estrada covers streaming SQL in Beam. A number of new systems support running SQL queries on top of data streams, and Apache Beam is no exception. It now provides a SQL transform that allows anyone to analyze streams from any source interactively, and it supports running the same query on multiple runners, including Flink, Spark, and Google Cloud Dataflow. We are also working to support mixed-language operations.
This talk starts by showing how the Beam SQL feature works, demoing a couple of simple use cases: a pure SQL pipeline as well as SQL embedded in a Java pipeline. Then we review how streaming SQL grew out of a collaboration between the Beam, Calcite and Flink communities, and its advantages over other SQL implementations. Finally, we dive deep into the architecture of Beam's implementation, the design decisions that were made along the way to build the feature, and how they have turned out.
Who should attend
Anyone interested in Data Engineering, Data Science and Machine Learning who wants to learn about one of the newer, exciting Apache projects focused on batch and stream processing of data. We aim to cover both the business value and the deeper technical details.
Thanks to Criteo (https://www.criteo.com) for providing the space.