Big Data & Analytics meetup 2020/02 - Distributed stream processing


Details
Online streaming! You can watch the talks here:
https://cloudera.zoom.us/j/236617782
This meetup will focus on streaming data technologies such as Structured Streaming in Apache Spark and the Apache Flink Streaming engine.
Planned speakers and talks:
- Streaming Technologies Intro
Marton Balassi, Cloudera
This talk will provide an overview of the current streaming technology landscape.
Marton is an Engineering Manager at Cloudera. He is an Apache Flink PMC member and one of the first contributors to the streaming API. He has driven big data adoption at around 50 customers as a Senior Solutions Architect at Cloudera during the last four years. He is the manager of the recently formed Streaming Analytics team and focuses on adding Flink to the Cloudera platform.
- Introduction to Spark Streaming
Gabor Somogyi, Cloudera
Spark's latest streaming engine is Structured Streaming, a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express a streaming computation the same way you would express a batch computation on static data. The Spark SQL engine takes care of running it incrementally and continuously, updating the final result as streaming data continues to arrive.
Gabor is a Software Engineer at Cloudera and an Apache Spark contributor who made major improvements to Spark's Kafka connector in Spark 3.0.
- What's new with Flink Streaming
Gyula Fora, Cloudera
Apache Flink Streaming is a low-latency, distributed data processing engine that focuses on stateful jobs and sophisticated windowing. Enterprises across industries, including Alibaba, Netflix and ING Bank, rely on Flink to deliver latency-critical business value in a range of use cases. In the latest release, the Flink community is focused on refining its SQL API to democratize stream processing by opening it up to the BI analyst community.
Gyula is a Software Engineer in the Flink Engineering team at Cloudera, working on integrating Flink into the Cloudera platform. He has been a committer and contributor since the early days of Flink streaming and has used Flink in large-scale production at King for almost four years, delivering innovative real-time applications at a global scale.
Schedule:
18:00 Doors open
18:30 Talks begin
20:00 Follow-up discussion
This event is jointly organized with the Future of Data: Budapest meetup group (http://bit.ly/3beBuoR)
Venue and catering will be provided by Cloudera Hungary. This is an English-language event.
