Benchmarking Apache Druid II: What’s Under the Hood?
Details
Note: this meetup will be recorded, but please RSVP so you can receive the link to the recording even if you can't attend the live meetup!
Apache Druid is a high-performance real-time analytics database capable of millisecond query response times over billions of events. We previously discussed a methodology for testing Druid using the Star Schema Benchmark (SSB) to characterize performance in a data warehouse scenario. This talk is for everyone who needs to know what makes Druid capable of serving low-latency analytics at scale. Join us to learn about Druid’s architecture and how it was designed and built to provide the speed and scale needed to power cloud-native applications. We’ll dive into its components, focusing on the query processing and storage layers, to understand how they drive Druid’s performance. Learn how Druid provides sub-second response times over billions of rows and why performance scales predictably with data growth.
Speaker Bio:
Matt Sarrel is an Apache Druid Evangelist at Imply, where he serves as a technical and educational resource for developers building high-performance real-time analytics apps. Matt has evangelized and built products at dozens of tech companies large and small, as well as for the open-source projects Apache Ignite, Apache Mesos, and now Apache Druid. He has used, contributed to, and advocated for open-source software throughout his career, starting way back in the late '90s when the now-gone Linux Router Project proved indispensable during a hardware failure. Matt frequently writes and speaks about topics such as cloud-native infrastructure and application development, data analytics, and security. He holds an MPH in epidemiology from Columbia and a BA in history from Cornell, and is a Certified Information Systems Security Professional (CISSP).

