
Distributed, high-throughput, persistent real-time messaging with Apache Kafka

Hosted by Matthew Rathbone

Details

Pizza and soda for the meetup are being provided by LinkedIn (http://linkedin.com/), the company that originally built Kafka. Thanks to @jaykreps (https://twitter.com/jaykreps) for the generosity.

Apache Kafka (http://kafka.apache.org/) is (from the project page):

... a distributed publish-subscribe messaging system. It is designed to support the following:

- Persistent messaging with O(1) disk structures that provide constant time performance even with many TB of stored messages.
- High-throughput: even with very modest hardware Kafka can support hundreds of thousands of messages per second.
- Explicit support for partitioning messages over Kafka servers and distributing consumption over a cluster of consumer machines while maintaining per-partition ordering semantics.
- Support for parallel data load into Hadoop.

Talk Outline:

- The design of Apache Kafka
- Some hand-wavy diagrams of the LinkedIn and Foursquare topologies
- Why I really like Kafka
- Kafka vs. other pub/sub and queue systems (like Redis and RabbitMQ)
- Kafka vs. other log persistence systems (like Flume and Scribe)
- Takeaways from the production deployment at Foursquare

Big Data Madison