June Edition: Streaming and Analytics
We're excited to be back with another evening of Database Internals!
This time, we’ll explore how modern data platforms are moving beyond batch-oriented architectures toward fresh, streaming-first, low-latency systems. With talks on Flink, Fluss, Kafka, and real-time context for agents, the evening will look at where data architecture is heading next.
As always, there will be plenty of time for networking, discussion, and snacks 🍕.
Event Details:
📍 Venue: Mindspace, Herzogspitalstraße 24
📅 Date & Time: Wednesday, June 10, 2026, doors open at 18:30
- Streaming Really Large Data with Flink and Fluss
Abstract: For years, data platforms have evolved from warehouses, to lakes, to lakehouses built on formats like Apache Iceberg. But AI and real-time applications are pushing those systems to their limits. They need fresh data, low latency, and streaming access without maintaining separate systems for analytics and operations.
In this talk, Ben Gamble, Field CTO at Ververica, explores the idea of the “Streamhouse”: a streaming-first architecture built around Apache Flink and Apache Fluss. Rather than treating streams and tables as separate worlds, Fluss provides a unified layer for real-time data processing and storage.
We’ll look at how Fluss uses Apache Arrow’s columnar format to make streaming more efficient, reduce infrastructure costs, and support fast analytical access to live data. Along the way, we’ll explore why these ideas matter for modern AI systems, operational analytics, and next-generation data platforms.
If you are interested in streaming systems, real-time analytics, or AI infrastructure, this talk offers a practical look at where data architecture is heading next.
Speaker Bio: Ben Gamble is Field CTO at Ververica, the original creators of Apache Flink. Based in Cambridge, UK, he brings engineering leadership experience across logistics, gaming, and mobile applications, and now works at the intersection of real-time stream processing, AI, and cloud architecture.
- Realtime Context Engine - a realtime analytical database
Abstract: As LLM applications move from chatbots to agents, they need a standard way to fetch operational data instead of relying only on RAG or periodically synced databases. MCP is becoming that interface, but it does not solve the underlying serving problem: turning streaming data into fresh, low-latency, governed context that an agent can query reliably. Real-Time Context Engine (RTCE) sits in that layer by materializing state from Kafka streams and exposing it through MCP. Lightning Tables is the new analytics database that backs RTCE. It is the Kafka-native serving and analytics engine built to query live data from Kafka and Iceberg tables via Tableflow using an Arrow in-memory layout, vectorized execution, and locality-aware scheduling. In this talk, we will walk through the architecture and the performance tradeoffs around freshness, point lookups, scans, and concurrency.
Speaker Bio: Ryan Murray is a Director at Confluent, leading the Lightning Table and Tableflow teams. He is a founder, an Apache Iceberg committer, and has done everything from database engineering to bond trading to theoretical physics. Ryan still dreams of winning the Stanley Cup one day.
Agenda:
🔹 18:30: Doors Open
🔹 18:40: Welcome
🔹 18:45: Talk #1: Ben Gamble (Ververica)
🔹 19:30: Pizza & Networking 🍕
🔹 20:00: Talk #2: Ryan Murray (Confluent)
