Iceberg Query Performance at Scale: StarTree vs. Trino vs. ClickHouse Benchmark
Details
To attend, register here.
A technical discussion of Iceberg query performance benchmark results across 12.2B rows of Parquet data on S3, including sub-second latency, CPU efficiency, caching behavior, and up to 15x lower cost per query.
Querying Iceberg data lakes on S3 gives platform teams flexibility, but interactive performance can become difficult to predict as data volumes and concurrency grow. For teams supporting analytics, the question is not just which engine can query the lake, but which architecture can deliver low latency without driving up compute, S3 reads, or operational overhead.
In this technical deep dive, we’ll walk through Iceberg query performance benchmark results comparing StarTree, Trino, and ClickHouse on the same 12.2B-row Parquet dataset in S3, covering:
- Benchmark setup: How the systems were configured, what queries were tested, and how results were measured (see the latency-measurement sketch after this list).
- Performance results: How each engine performed across latency, caching behavior, and query execution patterns.
- Resource efficiency: What the benchmark showed about CPU usage, S3 reads, and cost per query on identical infrastructure.
- Architecture tradeoffs: What the results reveal about scaling real-time analytics on Iceberg without moving or converting data.
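To make the "how results were measured" item above concrete, here is a minimal latency-measurement sketch in Python. It is illustrative only, not the harness used in the benchmark: it assumes the `trino` Python client, a Trino coordinator reachable at localhost:8080, an Iceberg catalog named "iceberg", and placeholder schema, table, and query names.

```python
# Illustrative latency measurement against an Iceberg table via Trino.
# Host, catalog, schema, table, and query below are placeholder assumptions,
# not the configuration used in the benchmark presented in this session.
import time
import statistics

import trino  # pip install trino

conn = trino.dbapi.connect(
    host="localhost",    # placeholder coordinator address
    port=8080,
    user="benchmark",
    catalog="iceberg",   # placeholder Iceberg catalog name
    schema="analytics",  # placeholder schema
)

QUERY = "SELECT status, count(*) FROM events GROUP BY status"  # placeholder query

latencies = []
for _ in range(10):  # repeat the same query to observe cache warm-up behavior
    cur = conn.cursor()
    start = time.perf_counter()
    cur.execute(QUERY)
    cur.fetchall()  # drain all results so end-to-end execution time is captured
    latencies.append(time.perf_counter() - start)

print(f"p50: {statistics.median(latencies):.3f}s  max: {max(latencies):.3f}s")
```

Repeating the query and reporting percentiles rather than a single run is what makes caching effects and latency variance visible, which is the kind of measurement the results discussion will cover.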
Leave with a clearer understanding of how StarTree, Trino, and ClickHouse compare across practical Iceberg query workloads — and what to consider when designing for sub-second latency, efficient infrastructure usage, and predictable cost at scale.
