
Uber x Apache Pinot Meetup

Hosted By
Uber E.

Details

Join us for an in-person meetup hosted by Uber and the Apache Pinot community. We will kick off with opening remarks by Shanshan Song, Senior Director of Engineering, Storage, Search and Data at Uber, followed by three talks by speakers from Uber, StarTree, and LinkedIn.

Event Details

  • This Meetup is co-hosted by Uber and StarTree
  • This Meetup is an in-person event only
  • Registration is required for the Meetup. Please RSVP & answer the questions (full name & email address)
  • Event location details will be emailed a few days before the event to those who have registered and provided an email address (the event is not at the location shown by the map pin)

Event Agenda

  • 5:00 PM - Networking and Snacks
  • 5:30 PM - Opening Remarks
  • 5:40 PM - Talk 1 - Uber
  • 6:10 PM - Talk 2 - StarTree
  • 6:40 PM - Talk 3 - LinkedIn
  • 7:10 PM - Closing Remarks

Presentation Info

  • Talk 1: Powering Real-Time Observability: Distributed Tracing on Apache Pinot at Uber
    Modern distributed systems generate vast amounts of tracing data, crucial for observability and diagnostics. At Uber, we leverage Jaeger for tracing and have integrated Apache Pinot as its high-performance backend for both indexing and primary storage. This talk will showcase how we utilize Pinot to ingest and query massive volumes of trace data in real time, delivering low-latency analytics at a highly efficient cost. Join us for a demo and deep dive into our architecture, highlighting how Pinot's capabilities supercharge our tracing system.
    - Speaker Bios:
    Xin Gao is a Staff Software Engineer in Uber's Real-Time Analytics (Pinot) team. He has played a key role in integrating Apache Pinot with Uber's large-scale logging and tracing observability systems. Xin specializes in enhancing Pinot's real-time ingestion performance, focusing on latency reduction, storage efficiency, and developing features like multi-topic ingestion to meet versatile user needs.
    Chen Xu is a Staff Software Engineer in Uber's Observability team, focusing on distributed tracing. He plays a crucial role in evolving storage solutions and enhancing query capabilities, significantly improving real-time troubleshooting for the company.
  • Talk 2: Pauseless Consumption in Apache Pinot
    Apache Pinot is a real-time OLAP database built for ultra-low latency analytics at scale. While it supports real-time ingestion, segment commit operations can introduce pauses ranging from a few seconds to, in rare cases, several minutes—posing a challenge for use cases that demand always-fresh data. In this talk, we'll explore the design and implementation of the new pauseless consumption feature in Pinot, which enables truly continuous ingestion without waiting for segment builds. Join us to learn about the architectural challenges we faced, how we overcame them, and what this means for real-time analytics going forward.
    - Speaker Bio:
    Xiaotian (Jackie) Jiang is a Founding Engineer & Chief Architect at StarTree and a PMC member for Apache Pinot. He is passionate about real-time user-facing analytics and constantly strives to enhance its value for businesses. He relishes solving the complex challenges of achieving low latency and high throughput in analytics, ensuring seamless performance and insightful user experiences.
  • Talk 3: Supporting predictable latencies during data refresh and host restarts
    After a pinot-server restart or a segment refresh, Pinot experiences cold page-cache effects: data pages are not yet loaded into memory, which leads to high tail latencies and query timeouts. We propose a page-cache warmup feature that issues selective "warmup" queries, sourced from an offline flow, during startup and refresh to preload pages with actual queries before serving traffic. Rollout at LinkedIn showed significant P95/P99 latency reductions and fewer timeouts with warmup enabled, improving SLO compliance during restarts and refreshes.
    - Speaker Bio:
    Praveen is a Systems Software Engineer in the Data Infrastructure team at LinkedIn, working on Pinot. He plays a crucial role in enhancing the platform's scalability, efficiency, and privacy features to support LinkedIn's vast user base.

