Minion Mastery Meetup: Navigating Pinot's Analytics Realm

Hosted By
StarTree

Details

Dive into Apache Pinot's Analytics Realm with Haitao Zhang!

Join us for an exclusive online session as we unlock the secrets of Apache Pinot's analytics prowess. Haitao Zhang, a distinguished Apache Pinot member and StarTree All-Star, will guide us through mastering Minions and navigating the realms of advanced analytics.

Minions to The Rescue - Tackling Complex Operations in Apache Pinot

Apache Pinot is a real-time distributed OLAP datastore that powers a variety of analytics use cases, which usually require executing high-throughput queries with low latency. To ensure data completeness, result correctness, and system performance, Pinot needs to execute background operational tasks such as data compaction, GDPR data purging, and reindexing after schema evolution. However, these operations can be computationally intensive and can easily degrade query performance if they run on the same component that executes queries.

Pinot leverages Minion, a Pinot-native component built upon Apache Helix’s task framework, to execute those computationally intensive operational tasks, offloading work from the query execution component without sacrificing query performance. The Minion component is designed to be easily extensible and pluggable: in addition to addressing the above issue, Minion is also used to build common data ingestion and backfilling pipelines, saving operators the time of building customized, ad hoc ones.

In this talk, we will take a deep dive into the Minion component and demonstrate how we leverage it in some typical operational tasks. We will also discuss the challenges faced while operating Minion at scale and how we greatly reduced the operational overhead by improving observability and introducing auto-scaling mechanisms.

To summarize: on one hand, Minion takes on most of the operational burden in Pinot, helping real-time analytics run smoothly; on the other hand, Minion gives operators the flexibility to perform complex operations that were previously hard (or even impossible) to perform, making for a more delightful analytics product experience.

For frequent content updates, join the community on Slack.

Real-Time Analytics with Apache Pinot™ by StarTree