Mike Keane - Integrating Flume and Kafka to Process >100B Entries per Day
Details
In this presentation, I will give an overview of Conversant’s data collection pipeline and detail how using Kafka as a buffer quickly solved the problems we had using Flume to write to a heavily utilized Hadoop cluster. I will also describe the development needed to customize open source tools to meet our business needs, and show how to quickly build a simple data pipeline for your own evaluation. Finally, I will touch on Kafka’s role in what is next on our development roadmap.
