**Fresha Data Meetup: Sweet Streams Are Made of This**

Join us on May 7th from **6:00pm** for a Data Streaming meetup hosted by **[Fresha](https://www.fresha.com/)**!
📍**Venue:**
**Fresha**
7th Floor
The Tower, 207 Old Street
London, EC1V 9NR
**PLEASE bring your PHOTO ID and register with your first and last name and email address. Thanks!**
**DOORS CLOSE AT 7PM FOR SECURITY.**
🗓 **Agenda:**
* 6:00pm – 6:30pm: Food, Drinks, and Networking
* 6:30pm – 7:00pm: Anton Borisov, Principal Data Engineer, Fresha & Nicoleta Lazar, Sr. Data Engineer, Fresha
* 7:00pm – 7:30pm: Olena Kutsenko, Staff Developer Advocate, Confluent
* 7:30pm – 8:00pm: Tom Scott, Founder & CEO, Streambased
* 8:00pm – 9:00pm: Q&A and Networking
💡**Speaker One:**
Anton Borisov, Principal Data Engineer, Fresha & Nicoleta Lazar, Sr. Data Engineer, Fresha
**Title of Talk:**
Apache Fluss: Streaming Storage for Real-Time Analytics
**Abstract:**
Apache Flink has largely solved streaming computation, but state management remains a bottleneck. Operator state—often backed by RocksDB—is scoped to individual jobs, making it hard to share state across pipelines without re-emitting it through systems like Apache Kafka. This leads to expensive joins and added operational complexity.
Apache Fluss (incubating) addresses this by externalizing state into a streaming storage layer backed by S3. Its PrimaryKey tables act as upsertable key-value stores with replicated changelogs, enabling shared state tables accessible from Flink, Spark, or native clients. With Flink 2.1’s Delta Join, both sides of a join can perform lookups against Fluss-managed indexes—eliminating the need for large join state altogether. In this talk, we’ll introduce Fluss, show key use cases like CDC enrichment and bilateral joins, and share lessons from running it in production on EKS, including the trade-offs and rough edges we encountered.
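For readers new to the idea, the PrimaryKey-table semantics the abstract describes (an upsertable key-value store that emits a replicated changelog, queryable by point lookup) can be sketched as a toy model in Python. This is an illustration of the concept only, not the Fluss API; all class and method names here are invented:

```python
# Toy model of an upsertable PrimaryKey table with a changelog.
# Illustrative only -- a real Fluss table replicates the changelog
# and persists state to object storage.

class PrimaryKeyTable:
    """Key-value table that emits changelog events on every upsert."""

    def __init__(self):
        self._rows = {}        # current state, keyed by primary key
        self.changelog = []    # would be a replicated log in practice

    def upsert(self, key, row):
        old = self._rows.get(key)
        if old is not None:
            self.changelog.append(("-U", key, old))   # retract old version
        self.changelog.append(("+U", key, row))       # emit new version
        self._rows[key] = row

    def lookup(self, key):
        # Point lookup against shared state -- conceptually what a
        # Delta Join does instead of buffering both join sides in
        # per-job operator state.
        return self._rows.get(key)


table = PrimaryKeyTable()
table.upsert("user-1", {"city": "London"})
table.upsert("user-1", {"city": "Berlin"})
print(table.lookup("user-1"))   # {'city': 'Berlin'}
print(len(table.changelog))     # 3 events: +U, -U, +U
```

Because the state lives behind a lookup interface rather than inside one Flink job, any pipeline (or another engine) can read the same table, which is the sharing property the talk focuses on.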
💡**Speaker Two:**
Olena Kutsenko, Staff Developer Advocate, Confluent
**Title of Talk:**
The journey of a record from Kafka topic to analytics table
**Abstract:**
Modern data platforms increasingly rely on streaming systems like Apache Kafka as the source of truth—but turning event streams into reliable, queryable analytics tables is far from trivial.
This talk explores what really has to happen when moving data from a Kafka topic into an Apache Iceberg table.
Beyond simple ingestion, we will walk through the essential steps: schema evolution, data normalization, partitioning, exactly-once guarantees, late-arriving events, and table maintenance.
Using Confluent Tableflow as one concrete example, we will examine how these challenges can be addressed in practice. Along the way, we will compare alternative approaches—such as Kafka Connect sinks, stream processing pipelines, and custom ingestion frameworks—to highlight trade-offs in correctness, latency, and operational complexity.
The goal is not to promote a single solution, but to provide a mental model for designing robust streaming-to-lakehouse pipelines, helping you understand what matters regardless of the tooling you choose.
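Two of the steps listed above, event-time partitioning and late-arriving events, can be sketched in a few lines of Python. This is a simplified model under assumed semantics (daily partitions, a fixed allowed lateness against a watermark); the function and variable names are illustrative, not from Tableflow or any real ingestion framework:

```python
# Toy sketch: assign records to daily event-time partitions, diverting
# events that arrive too far behind the watermark for separate handling.
# Names and policy are illustrative, not a real ingestion API.

from datetime import datetime, timedelta, timezone

LATENESS_ALLOWED = timedelta(hours=1)

def route(record, watermark, partitions, late):
    """Place a record in its daily partition, or in `late` if too old."""
    event_time = datetime.fromtimestamp(record["ts"], tz=timezone.utc)
    if event_time < watermark - LATENESS_ALLOWED:
        late.append(record)          # too late: handle out of band
        return
    day = event_time.date().isoformat()
    partitions.setdefault(day, []).append(record)

partitions, late = {}, []
watermark = datetime(2025, 5, 7, 12, 0, tzinfo=timezone.utc)
on_time = {"ts": datetime(2025, 5, 7, 11, 30, tzinfo=timezone.utc).timestamp(), "v": 1}
too_late = {"ts": datetime(2025, 5, 6, 9, 0, tzinfo=timezone.utc).timestamp(), "v": 2}
route(on_time, watermark, partitions, late)
route(too_late, watermark, partitions, late)
print(sorted(partitions))   # ['2025-05-07']
print(len(late))            # 1
```

Real pipelines must also make this idempotent under retries (the exactly-once step) and rewrite partition metadata atomically, which is where a table format like Iceberg does the heavy lifting.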
**Bio:**
Olena is a Staff Developer Advocate at Confluent and a recognized expert in data streaming and analytics. With two decades of experience in software engineering, she has built mission-critical applications, led high-performing teams, and driven large-scale technology adoption at industry leaders like Nokia, HERE Technologies, AWS, and Aiven.
A passionate advocate for real-time data processing and AI-driven applications, Olena helps developers and organizations harness the power of streaming data. She is an AWS Community Builder, a dedicated mentor, and a volunteer instructor at a nonprofit tech school, helping to shape the next generation of engineers.
As an international speaker and thought leader, Olena regularly presents at top global conferences, sharing deep technical insights and hands-on expertise. Whether through her talks, workshops, or content, she is committed to making complex technologies accessible and inspiring innovation in the developer community.
💡**Speaker Three:**
Tom Scott, Founder & CEO, Streambased
**Title of Talk:**
*Infinite Kafka? Rethinking Retention with Iceberg*
**Abstract:**
Apache Kafka is designed for high-throughput, low-latency event streaming, not cost-efficient long-term storage. Yet we constantly see cases like event sourcing, audit/compliance, and large-scale reprocessing forcing that pattern onto it.
As retention increases, costs grow linearly to support edge cases, one-time runs, and “checkbox” use cases.
Can Iceberg help here?
In this talk, Tom explores a hybrid architecture that separates hot and cold data while preserving Kafka’s log semantics. Using a combination of Kafka and Apache Iceberg, he demonstrates how to extend Kafka into low-cost object storage, enabling effectively unlimited retention without sacrificing performance or access patterns.
The result is a unified log that supports both real-time processing and long-term replay, removing the traditional trade-off between cost and capability in Kafka-based systems.
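The "unified log" read path described above can be sketched as a dispatch on offset: recent offsets are served from the hot log, older ones from cold object storage. A minimal sketch under assumed semantics (a single contiguous offset space, records already tiered off by offset); the names are hypothetical, not a Streambased or Kafka API:

```python
# Toy sketch of a hybrid read path: one offset space, two tiers.
# `hot_start_offset` is the first offset still held in the hot log;
# everything below it has been tiered off to cold object storage.

def read(offset, hot_start_offset, hot_log, cold_store):
    """Return the record at `offset`, whichever tier holds it."""
    if offset >= hot_start_offset:
        return hot_log[offset - hot_start_offset]   # recent: hot log
    return cold_store[offset]                       # old: cold storage

cold_store = {0: "a", 1: "b", 2: "c"}   # tiered-off records by offset
hot_log = ["d", "e"]                    # offsets 3 and 4 still hot
replay = [read(o, 3, hot_log, cold_store) for o in range(5)]
print(replay)   # ['a', 'b', 'c', 'd', 'e']
```

The point of the sketch is that a consumer replaying from offset 0 never sees the tier boundary, which is what lets retention grow without changing Kafka-style access patterns.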
**Bio:**
A long-time enthusiast of Kafka and all things data integration, Tom has more than 15 years of experience finding innovative and efficient ways to store, query, and move data. Tom is currently CEO at Streambased, a company focused on unifying operational and analytical data estates into a single, consistent, and efficient data layer.
***
If you are interested in hosting/speaking at a meetup, please email community@confluent.io