

What we’re about
This group brings together data engineers, analysts, and developers in the Chicago area who are building the future of analytics with Trino, Starburst, and the open data stack.
Together, Trino and Starburst power the modern data lakehouse for analytics and AI across clouds, on-prem, or hybrid environments, enabling open, federated access without complex migrations.
Our meetups explore topics like AI-driven analytics, Iceberg adoption, data governance, and performance at scale, featuring talks from community members, contributors, and industry leaders.
Whether you’re running queries across diverse sources or designing the next generation of lakehouse architectures, this community is where we connect, learn, and share insights—because data grows stronger when it’s connected.
Upcoming events
From Kafka to Iceberg: High-Throughput Ingestion at Starburst
Online
Join us for a focused, 45-minute online meetup in which we’ll walk through our architecture, the key design decisions, and the lessons learned in building for scale. The result: a fully managed, easy-to-operate system benchmarked at 100 GB/s throughput, delivering cost-effective, near-real-time ingestion with exceptional query performance.
Real-time analytics is critical for modern businesses — but bridging the gap between fast-moving Kafka streams and query-ready lakehouse tables remains a complex challenge. At Starburst, we encountered this firsthand while ingesting our internal telemetry data into Iceberg tables. Existing solutions fell short, plagued by issues such as lack of exactly-once guarantees, limited scalability, head-of-line blocking, small file proliferation, and high operational overhead and cost.
This prompted us to rethink streaming ingestion from the ground up. Our goal was a fully managed, highly available system that makes data actionable within minutes, optimized for both performance and usability. We designed and built a custom Kafka-to-Iceberg ingestion service that directly writes data in Iceberg format with strong guarantees, minimal latency, and continuous data maintenance for optimal query performance. Along the way, we developed novel techniques — such as Iceberg-aware commit coordination and adaptive Kafka consumer assignment — to overcome typical ingestion bottlenecks and deliver best-in-class price/performance.
Speaker: Lakshmikant (Pachu) Shrinivas, Staff Software Engineer at Starburst. Hosted by Lester Martin, Developer Advocate at Starburst.
