What we're about

Apache Flink Meetup Berlin is for developers interested in and using the open-source framework Apache Flink for distributed stream and batch data processing. Learn about Flink, its capabilities and use cases at https://flink.apache.org/flink-architecture.html . This meetup group is a place for you to talk to other developers and share your experiences!

What type of events do we host?

We organize community events featuring use case talks & demos by Flink users and sessions with Flink contributors, committers, and PMC members.

We’re always on the lookout for interesting talk ideas, so please feel free to submit yours: https://airtable.com/shrhqC4OSwG4cgKJf

📝Check out the latest documentation on how to get started with Flink:

https://ci.apache.org/projects/flink/flink-docs-stable/index.html

📢Follow @ApacheFlink on Twitter for news & updates: https://twitter.com/ApacheFlink

🐿️Explore Flink’s ecosystem of connectors, APIs, extensions, tools, and integrations: https://flink-packages.org/

💻Learn how to contribute to Flink: https://flink.apache.org/contributing/how-to-contribute.html

Policies

We’re committed to providing a friendly, safe and welcoming environment for all attendees, regardless of gender, sexual orientation, ability, ethnicity, socioeconomic status and religion (or lack thereof). We ask all attendees to help create a safe and positive experience for everyone. Please contact the meetup organizers if you witness or experience unacceptable behaviour.

* Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.

Upcoming events (1)

Streaming from Iceberg Data Lake & Multi Cluster Kafka Source

Online event

Streaming from Iceberg Data Lake
Steven Zhen Wu, Apple

Apache Iceberg brings numerous benefits like snapshot isolation, transactional commits, fast scan planning and time travel support. These features solved important correctness and performance challenges for batch processing use cases. Although it was originally adopted for batch, Iceberg can also be leveraged as a streaming source. Compared to periodically scheduled batch ETL jobs, streaming reads can reduce the processing delay from hours to minutes.

In this talk, we are going to discuss how the Flink Iceberg source enables streaming reads from Iceberg tables, where long-running Flink jobs continuously poll for new data and process it as soon as it is committed. We will discuss the design of the source operator, focusing in particular on the streaming read mode. We will compare the Kafka and Iceberg sources for streaming reads, and discuss how the Iceberg streaming source can power common stream processing use cases. Finally, we will present the performance evaluation results of the Iceberg streaming read.
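
For readers new to the idea, the "poll and process data as soon as committed" pattern can be illustrated with a toy model. This is only a conceptual sketch in plain Python, not the Flink Iceberg source or its API: `ToyTable`, `Snapshot` and `IncrementalReader` are hypothetical names invented for illustration, and a real source also handles split planning, checkpointing and failure recovery.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Snapshot:
    snapshot_id: int
    records: List[str]


@dataclass
class ToyTable:
    """Hypothetical stand-in for a table whose commits are append-only snapshots."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def commit(self, records: List[str]) -> int:
        # Each commit becomes a new snapshot with a monotonically increasing id.
        snapshot_id = len(self.snapshots) + 1
        self.snapshots.append(Snapshot(snapshot_id, list(records)))
        return snapshot_id


class IncrementalReader:
    """Remembers the last snapshot it consumed and reads only newer ones."""

    def __init__(self, table: ToyTable, start_after: int = 0):
        self.table = table
        self.last_consumed = start_after

    def poll(self) -> List[str]:
        # Return records from snapshots committed since the previous poll.
        out: List[str] = []
        for snap in self.table.snapshots:
            if snap.snapshot_id > self.last_consumed:
                out.extend(snap.records)
                self.last_consumed = snap.snapshot_id
        return out


table = ToyTable()
reader = IncrementalReader(table)

table.commit(["a", "b"])
first = reader.poll()      # sees the first commit
empty = reader.poll()      # nothing new has been committed
table.commit(["c"])
table.commit(["d", "e"])
second = reader.poll()     # sees both new commits, in commit order
```

A long-running job would call something like `poll()` on a fixed interval, so the delay between a commit and its processing is bounded by that interval rather than by a batch schedule.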

Multi Cluster Kafka Source
Mason Chen, Apple

Flink consumers read from Kafka as a scalable, high-throughput, and low-latency data source. However, there are challenges in scaling out data streams where migration and multiple Kafka clusters are required. Thus, we introduced a new Kafka source to read sharded data across multiple Kafka clusters in a way that conforms well with elastic, dynamic, and reliable infrastructure.

In this presentation, we will cover the source design and how the solution increases application availability while reducing maintenance toil. Furthermore, we will describe how we extended the existing KafkaSource to provide mechanisms to read logical streams located on multiple clusters, to dynamically adapt to infrastructure changes, and to perform transparent cluster migrations and failover.
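
As a rough illustration of what "logical streams located on multiple clusters" and "dynamically adapt to infrastructure changes" could mean, here is a toy sketch in plain Python. It is not the design presented in the talk and uses no Flink or Kafka API: the metadata layout and the `resolve` and `reconcile` functions are assumptions made up for this example.

```python
from typing import Dict, Set, Tuple

# A physical location is a (cluster, topic) pair.
Location = Tuple[str, str]

# Hypothetical metadata: logical stream id -> cluster name -> topics on that cluster.
StreamMetadata = Dict[str, Dict[str, Set[str]]]


def resolve(metadata: StreamMetadata, stream_id: str) -> Set[Location]:
    """Expand a logical stream into the physical locations that back it."""
    return {
        (cluster, topic)
        for cluster, topics in metadata[stream_id].items()
        for topic in topics
    }


def reconcile(current: Set[Location], desired: Set[Location]):
    """Work out which readers to start and which to stop after a metadata change."""
    to_start = desired - current
    to_stop = current - desired
    return to_start, to_stop


# Before a migration, the logical stream "orders" lives on clusters a and b.
before: StreamMetadata = {"orders": {"cluster-a": {"orders"}, "cluster-b": {"orders"}}}
# After the migration, cluster-a has been replaced by cluster-c.
after: StreamMetadata = {"orders": {"cluster-b": {"orders"}, "cluster-c": {"orders"}}}

current = resolve(before, "orders")
desired = resolve(after, "orders")
to_start, to_stop = reconcile(current, desired)
```

In this model the application only names the logical stream; readers for unchanged locations (here `cluster-b`) keep running while the difference is applied.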
