Building a Data Pipeline for Ad Tech using Spark, Impala, and Zoomdata

Hosted By
Avinash R. and Shekhar V.

Details

Topic: Building a Data Pipeline for Ad Tech using Spark, Impala, and Zoomdata to support 1 Billion auctions a day and beyond

Abstract: Ad Tech is a natural fit for Big Data tools and solutions. Many Ad Tech companies ingest millions or billions of records per day and want to report on and query that data in near real time. We recently worked with an Ad Tech company and Zoomdata to implement a solution using Spark, Impala, and Zoomdata. Our intent was to build a platform that can ingest billions of auctions per day and support reporting and querying at the raw transactional level, all in near real time.

In this presentation, we will cover: an overview of the space, the project, and the use cases; the architectures before and after the project; the benefits over the company's prior solution; and the lessons we learned.

Presenter: Martin Gragg

BIO: Martin Gragg is a Principal Consultant at Clairvoyant LLC. His current focus is building data pipelines using Big Data technologies. He has been building, supporting, and administering enterprise database solutions for over 15 years at companies like Apollo Education Group and Motorola, using Oracle and, more recently, MongoDB and Hadoop.

PS: Please note the change of topic. The Confluent team cannot make it to the session on August 3rd. We will reschedule the Kafka Streams session for a future date.

Data Analytics PHX
The University of Advancing Technology
2625 W. BASELINE RD. · Tempe, AZ