From Kafka to Iceberg: High-Throughput Ingestion at Starburst

Name: From Kafka to Iceberg: High-Throughput Ingestion at Starburst
Start: 2026-02-18T10:00:00-08:00
End: 2026-02-18T11:00:00-08:00

Hosted by Trino by Starburst - Bay Area Meetup Group

Details

Join us for a focused, 45-minute online meetup in which we’ll walk through our architecture, the key design decisions, and the lessons learned in building for scale. The result: a fully managed, easy-to-operate system benchmarked at 100 GB/s throughput, delivering cost-effective, near-real-time ingestion with exceptional query performance.

From Kafka to Iceberg: High-Throughput Ingestion at Starburst
Real-time analytics is critical for modern businesses — but bridging the gap between fast-moving Kafka streams and query-ready lakehouse tables remains a complex challenge. At Starburst, we encountered this firsthand while ingesting our internal telemetry data into Iceberg tables. Existing solutions fell short, plagued by issues such as lack of exactly-once guarantees, limited scalability, head-of-line blocking, small file proliferation, and high operational overhead and cost.

This prompted us to rethink streaming ingestion from the ground up. Our goal was a fully managed, highly available system that makes data actionable within minutes, optimized for both performance and usability. We designed and built a custom Kafka-to-Iceberg ingestion service that directly writes data in Iceberg format with strong guarantees, minimal latency, and continuous data maintenance for optimal query performance. Along the way, we developed novel techniques — such as Iceberg-aware commit coordination and adaptive Kafka consumer assignment — to overcome typical ingestion bottlenecks and deliver best-in-class price/performance.

Speaker: Lakshmikant (Pachu) Shrinivas, Staff Software Engineer at Starburst.
Host: Lester Martin, Developer Advocate at Starburst.

AI summary

By Meetup

Online 45-minute meetup for data engineers on high-throughput Kafka-to-Iceberg ingestion, presenting a system benchmarked at 100 GB/s throughput.
