Powerful Ad hoc, Real-time Analytics on Large Datasets using Apache Druid

Engineers from Imply are visiting to present at this month's meetup, which is hosted at Improving.

Providing users with self-service analytics on real-time data in large, multi-petabyte datasets is difficult. In an ideal world, you would give users access to the entire database with every dimension at their fingertips. In practice, this is often either technically infeasible or far too slow to be useful.

Enter Apache Druid. Druid is an open-source analytics database that powers real-time, ad hoc analytics. Companies like Apple, Netflix, Twitter, and Airbnb use it for clickstream analytics, network telemetry, fraud detection, application monitoring, and more. It can ingest millions of records per second and deliver sub-second response times to OLAP-style slice-and-dice queries on extremely large datasets.

In this presentation, we will start with the fundamentals of Druid before taking a deep dive into its inner workings. We will use Druid and the Imply Pivot data visualization UI, which is optimized for Druid, to explore data at scale.

Speaker Bio:
Mike McLaughlin is a senior field engineer at Imply Data. He helps customers run and optimize Apache Druid in production. He has 20 years of engineering experience in data-oriented software.

Agenda:
6:00 pm - 6:30 pm: Networking over pizza and drinks
6:30 pm - 6:35 pm: Welcome and speaker intro
6:35 pm - 7:30 pm: Presentation
7:30 pm - 8:00 pm: Q&A, discussion and more networking

With the holiday season approaching, this will be the last meetup of the year. We will be back in January 2020 with more great presentations and learning opportunities for everyone.