Distributed, high-throughput, persistent real-time messaging with Apache Kafka

Name: Distributed, high-throughput, persistent real-time messaging with Apache Kafka
Start: 2013-02-26T18:00:00-06:00
End: 2013-02-26T21:00:00-06:00
Location: Bendyworks

Hosted By

Matthew R.

Details

Pizza and soda for the meetup are being provided by LinkedIn (http://linkedin.com/), the company that originally built Kafka. Thanks to @jaykreps (https://twitter.com/jaykreps) for the generosity.

Apache Kafka (http://kafka.apache.org/) is, in the words of the project page:

... a distributed publish-subscribe messaging system. It is designed to support the following:

- Persistent messaging with O(1) disk structures that provide constant time performance even with many TB of stored messages.
- High-throughput: even with very modest hardware Kafka can support hundreds of thousands of messages per second.
- Explicit support for partitioning messages over Kafka servers and distributing consumption over a cluster of consumer machines while maintaining per-partition ordering semantics.
- Support for parallel data load into Hadoop.

Talk Outline:

- The design of Apache Kafka
- Some hand-wavy diagrams of the LinkedIn and Foursquare topologies
- Why I really like Kafka
- Kafka vs. other pub/sub and queue systems (like Redis and RabbitMQ)
- Kafka vs. other log persistence systems (like Flume and Scribe)
- Takeaways from the production deployment at Foursquare

Events in Madison, WI