
About us
The Real-Time Analytics meetup covers a range of topics around building real-time analytics systems, including use cases, technical deep dives, and best practices.
Interested in speaking, organizing, or volunteering? Contact community@startree.ai
This meetup is organized by the founders of StarTree, the original creators of Apache Pinot.
Apache Pinot is a real-time distributed OLAP datastore, used to deliver scalable real-time analytics with low latency. It can ingest data from batch data sources (S3, HDFS, Azure Data Lake, Google Cloud Storage) as well as streaming sources (such as Kafka). Pinot is used extensively at LinkedIn and Uber to power many analytical applications such as Who Viewed My Profile, Ad Analytics, Talent Analytics, Uber Eats, and many more, serving 200k+ queries per second while ingesting 1 million+ events per second.
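As a rough illustration of what querying Pinot looks like, the sketch below builds a SQL request for a Pinot broker's HTTP query endpoint (`POST /query/sql`, which accepts a JSON body with a `sql` field). The broker address, port, and table name `clickEvents` are assumptions for the example, not part of any real deployment.

```python
import json
from urllib import request

def build_pinot_query(sql: str) -> bytes:
    # Pinot brokers accept SQL via POST /query/sql with a JSON body
    # of the form {"sql": "<query>"}.
    return json.dumps({"sql": sql}).encode("utf-8")

# Hypothetical broker address and table name -- adjust for a real cluster.
BROKER_URL = "http://localhost:8099/query/sql"
sql = "SELECT country, COUNT(*) FROM clickEvents GROUP BY country LIMIT 10"

payload = build_pinot_query(sql)

# Uncomment to run against a live Pinot broker:
# req = request.Request(BROKER_URL, data=payload,
#                       headers={"Content-Type": "application/json"})
# with request.urlopen(req) as resp:
#     print(json.load(resp)["resultTable"])
```

Because brokers fan the query out to servers holding both streaming and batch segments, the same SQL transparently covers freshly ingested Kafka events and historical data.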
Resources
> • What is Apache Pinot? https://www.startree.ai/what-is-apache-pinot
> • Launching At LinkedIn: The Story of Apache Pinot: https://www.startree.ai/blog/launching-at-linkedin-the-story-of-apache-pinot
> • For more info on Apache Pinot go to dev.startree.ai
> • Our community is active on Slack! To join our Slack, go to stree.ai/slack
Upcoming events

Iceberg Query Performance at Scale: StarTree vs. Trino vs. ClickHouse Benchmark
Online · To attend, register here.
A technical discussion of Iceberg query performance benchmark results across 12.2B rows of Parquet data on S3 — including sub-second latency, CPU efficiency, caching behavior, and up to 15x lower cost per query.
Querying Iceberg data lakes on S3 gives platform teams flexibility, but interactive performance can become difficult to predict as data volumes and concurrency grow. For teams supporting analytics, the question is not just which engine can query the lake — it is which architecture can deliver low latency without driving up compute, S3 reads, or operational overhead.
In this technical deep dive, we’ll walk through Iceberg query performance benchmark results comparing StarTree, Trino, and ClickHouse on the same 12.2B-row Parquet dataset in S3, covering:
- Benchmark setup: How the systems were configured, what queries were tested, and how results were measured.
- Performance results: How each engine performed across latency, caching behavior, and query execution patterns.
- Resource efficiency: What the benchmark showed about CPU usage, S3 reads, and cost per query on identical infrastructure.
- Architecture tradeoffs: What the results reveal about scaling real-time analytics on Iceberg without moving or converting data.
Leave with a clearer understanding of how StarTree, Trino, and ClickHouse compare across practical Iceberg query workloads — and what to consider when designing for sub-second latency, efficient infrastructure usage, and predictable cost at scale.
Past events

