
[In-Person + Online] Stream Processing with Apache Kafka, Samza, and Flink

Hosted By
Adem Efe G. and 2 others

Details

5:30 - 6:00: Networking [in-person only + catered food]
6:00 - 6:05: Welcome
6:05 - 6:40: [Datorios] Title: Visually Debugging Flink Incidents - Flink in 3D
Colten Pilgreen, Datorios
Are you new to stream processing and finding it challenging to understand the behavior of your Flink pipelines? Or perhaps you're reluctant to use Flink for mission-critical production use cases because of the difficulty of monitoring and the limited insight Flink provides out of the box.
Join this session to discover a streamlined approach to incident resolution for Flink pipelines, empowering you to efficiently troubleshoot and maintain pipeline reliability. If you're seeking to enhance your pipeline observability and reduce incident response times, then you will not want to miss this event!
In this session, Colten Pilgreen will present:

  • Unified Troubleshooting: Learn how a centralized view reduces the time spent resolving incidents in your Flink pipelines.
  • Comprehensive Insights: Explore what you gain by correlating critical metrics, the evolution of state values, and error log messages with the incoming records that led up to an incident (a minimal Flink metrics sketch follows this list).
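
For readers new to Flink observability: the correlation described above builds on Flink's metric system. Below is a minimal sketch, using Flink's documented metric API, of registering a custom counter that a monitoring tool (or Flink's own dashboard) can then surface; the class and metric names are illustrative, not part of any product mentioned in this talk:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    // Hypothetical mapper that counts records it cannot handle, so the
    // counter can later be correlated with logs and incoming records.
    public class ErrorCountingMapper extends RichMapFunction<String, String> {
        private transient Counter parseErrors;

        @Override
        public void open(Configuration parameters) {
            // Registers a custom counter alongside Flink's built-in metrics.
            parseErrors = getRuntimeContext().getMetricGroup().counter("parseErrors");
        }

        @Override
        public String map(String value) {
            if (value == null || value.isEmpty()) {
                parseErrors.inc(); // flag erroneous records for incident analysis
            }
            return value;
        }
    }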

Colten Pilgreen is Datorios' Flink Advisor. He has been an invested user of the Flink framework for the past 4 years, first in the transportation industry and now in the ad-tech space at Hulu. He brings 9 years of data engineering experience and has been tinkering in various programming spaces for the past 15 years. Colten has been a key member of his teams, designing, implementing, and troubleshooting a wide range of Flink pipelines, as well as standing up, managing, and improving the infrastructure required to run and monitor them.

6:40 - 7:15: [StreamNative] Title: Ursa: Kafka-compatible data streaming on Lakehouse
Matteo Merli, StreamNative
Ursa is a Kafka-compatible data streaming engine built on top of a lakehouse, enabling users to store their topics and associated schemas directly in lakehouse tables. Ursa utilizes the innovations that StreamNative has developed to evolve Pulsar's storage layer from a disk-based shared storage layer to an object storage-based tiered storage system and to integrate with the lakehouse ecosystem. The Ursa engine simplifies the integration between data streams and lakehouse tables, drastically reducing the complexity of using bespoke integrations. In this talk, we will dive deeper into the details of the Ursa engine and how it leverages the lakehouse as a storage backend.
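
Since Ursa is Kafka-compatible, a standard Kafka client should be able to produce to it as if it were a Kafka broker. Here is a minimal sketch under that assumption; the endpoint and topic name are hypothetical:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class UrsaProduceExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Hypothetical Ursa endpoint; the engine speaks the Kafka protocol,
            // so the client is configured exactly as it would be for Kafka.
            props.put("bootstrap.servers", "ursa.example.com:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Per the abstract, records written here are stored in
                // lakehouse tables rather than on broker-local disks.
                producer.send(new ProducerRecord<>("events", "user-42", "{\"action\":\"click\"}"));
            }
        }
    }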

Matteo is the CTO at StreamNative, where he brings rich experience in distributed pub-sub messaging platforms. He was one of the co-creators of Apache Pulsar during his time at Yahoo!, where he and Sijie Guo built the global, distributed messaging system that would later become Apache Pulsar. Matteo then co-founded Streamlio with Sijie, and after its acquisition served as a Senior Principal Software Engineer at Splunk. Matteo is the PMC Chair of Apache Pulsar, where he helps guide the community and ensure the success of the project, and a PMC member of Apache BookKeeper. He lives in Menlo Park, California.

7:15 - 7:50: [LinkedIn] Title: Combine Apache Beam and Flink SQL for Stream and Batch Unification
Jiangjie (Becket) Qin, LinkedIn
At LinkedIn, streaming application developers currently have two APIs: Apache Beam as the programming API and Apache Flink SQL. We see legitimate use cases that benefit from mixing the two in a single application, so we developed a Flink SQL PTransform that lets users embed SQL processing logic inside a Beam pipeline. This talk introduces the use cases it is designed to solve and how we achieved it.
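
LinkedIn's Flink SQL PTransform itself is not publicly documented in this abstract, but open-source Beam ships an analogous construct, SqlTransform, that illustrates the general pattern of embedding SQL mid-pipeline. A minimal sketch using Beam's own SqlTransform (not LinkedIn's component; the schema and query are illustrative):

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.extensions.sql.SqlTransform;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.schemas.Schema;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.Row;

    public class MixedBeamSqlExample {
        public static void main(String[] args) {
            Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

            Schema schema = Schema.builder()
                .addStringField("page")
                .addInt64Field("clicks")
                .build();

            // Ordinary Beam programming-API section of the pipeline.
            PCollection<Row> rows = p.apply(Create.of(
                    Row.withSchema(schema).addValues("home", 3L).build(),
                    Row.withSchema(schema).addValues("docs", 5L).build())
                .withRowSchema(schema));

            // SQL logic embedded inside the Beam pipeline; LinkedIn's
            // Flink SQL PTransform plays the same role with Flink SQL semantics.
            rows.apply(SqlTransform.query(
                "SELECT page, SUM(clicks) AS total FROM PCOLLECTION GROUP BY page"));

            p.run().waitUntilFinish();
        }
    }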

Becket is currently a Principal Staff Software Engineer at LinkedIn. He started working on Apache Kafka at LinkedIn after graduating from Carnegie Mellon University. He then joined Alibaba, where he led the Flink team with a focus on Flink SQL, PyFlink, Flink ML, and connectors, among other areas. He returned to LinkedIn in 2022 to drive the effort of stream and batch unification.
Becket is a PMC member of Apache Kafka and Apache Flink.
