Apache Druid® and Apache Dolphin®: Orchestrating Batch Ingestion from Amazon S3


Details
This session introduces Apache Druid and Apache DolphinScheduler. Apache DolphinScheduler is a distributed, extensible open-source workflow orchestration platform with a powerful visual DAG interface. Apache Druid is a high-performance, real-time analytics database that delivers sub-second queries on streaming and batch data at scale and under load.
We will look at how DolphinScheduler integrates with tools in the machine learning ecosystem, schedules real-time streaming jobs as DAG tasks, supports writing workflows in high-level programming languages, and adapts to cloud-native technologies.
Apache Druid's multi-stage query engine (MSQE) will be reviewed, along with some of the key use cases it enables for Druid. The session will end with a demo of how batch ingestion into Druid from S3 can be orchestrated using DolphinScheduler, and of the analytics that can then be run on the data in Druid.
