

What we’re about
This group is for people in the Greater Boston Area who are interested in Trino (formerly PrestoSQL), the SQL query engine: what it is, how it's used, and who uses it.
Trino is a high-performance, distributed SQL query engine for big data. Its architecture lets users query a variety of data sources, such as Hadoop, AWS S3, Iceberg, MySQL, Cassandra, Kafka, Druid, MongoDB, and more! You can even run a federated query that combines data from multiple data sources. Trino is community-driven, open-source software released under the Apache License.
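To make the idea of a federated query concrete, here is a minimal sketch in Python. It assumes the `trino` client package is installed and a coordinator is reachable. The catalog, schema, table, and column names (`mysql.crm.customers`, `iceberg.sales.orders`, and so on), along with the host, port, and user defaults, are hypothetical placeholders rather than anything from a real deployment.

```python
# A single SQL statement that joins a MySQL table with an Iceberg table.
# Trino addresses tables as catalog.schema.table, so one query can span sources.
# All catalog, schema, table, and column names below are hypothetical.
FEDERATED_QUERY = """
SELECT c.customer_id, c.name, sum(o.total) AS lifetime_value
FROM mysql.crm.customers AS c
JOIN iceberg.sales.orders AS o
  ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY lifetime_value DESC
LIMIT 10
"""


def run_federated_query(host="localhost", port=8080, user="demo"):
    """Run the query on a Trino coordinator and return all rows.

    The host, port, and user defaults are assumptions for a local test setup.
    """
    # Imported lazily so this module loads even without the client installed.
    from trino.dbapi import connect

    conn = connect(host=host, port=port, user=user)
    cur = conn.cursor()
    cur.execute(FEDERATED_QUERY)
    return cur.fetchall()
```

Calling `run_federated_query()` against a cluster that has both catalogs configured would return the joined rows. The join itself runs in Trino, so neither source system needs to know about the other.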
Upcoming events
From Kafka to Iceberg: High-Throughput Ingestion at Starburst (online)
Join us for a focused, 45-minute online meetup in which we’ll walk through our architecture, the key design decisions, and the lessons learned in building for scale. The result: a fully managed, easy-to-operate system benchmarked at 100 GB/s throughput, delivering cost-effective, near-real-time ingestion with exceptional query performance.
Real-time analytics is critical for modern businesses — but bridging the gap between fast-moving Kafka streams and query-ready lakehouse tables remains a complex challenge. At Starburst, we encountered this firsthand while ingesting our internal telemetry data into Iceberg tables. Existing solutions fell short, plagued by issues such as lack of exactly-once guarantees, limited scalability, head-of-line blocking, small file proliferation, and high operational overhead and cost.
This prompted us to rethink streaming ingestion from the ground up. Our goal was a fully managed, highly available system that makes data actionable within minutes, optimized for both performance and usability. We designed and built a custom Kafka-to-Iceberg ingestion service that directly writes data in Iceberg format with strong guarantees, minimal latency, and continuous data maintenance for optimal query performance. Along the way, we developed novel techniques — such as Iceberg-aware commit coordination and adaptive Kafka consumer assignment — to overcome typical ingestion bottlenecks and deliver best-in-class price/performance.
Speaker: Lakshmikant (Pachu) Shrinivas, Staff Software Engineer at Starburst. Host: Lester Martin, Developer Advocate at Starburst.
Past events
