For this Kafka Meetup, we welcome Lionel Montrieux, Ricardo De Cillo, Dmitriy Sorokin, Andrey Dyachkov, Max Schultze, and Daniel Truemper, with a series of short talks highlighting tales from the trenches. The talks focus on operating large, critical Kafka clusters in the cloud, and we hope to foster discussion and share best practices for running Kafka without losing sleep.
*Please register on Eventbrite to guarantee entrance: Kafka Meetup Registration (https://www.eventbrite.com/e/kafka-meetup-at-the-shuttle-tickets-38624840939?aff=meetup)
17:30 - Doors Open - Drinks + Snacks
18:20 - Welcome and Intro
18:30 - Nakadi Event Broker - Lionel Montrieux
18:45 - Operating Kafka and Zookeeper on AWS - Ricardo De Cillo
19:00 - Bubuku - A Supervisor to Run Kafka on AWS - Dmitriy Sorokin
19:15 - Break - Drinks, Snacks, and Discussions
19:45 - Kafka and EBS - Andrey Dyachkov
20:00 - The Tricky Thing about Offsets - Max Schultze
20:15 - Kafka-powered Microservice - Daniel Truemper
20:30 - Networking + Drinks
21:45 - Event ends
For more details on topics and speakers, please read below:
Nakadi Event Broker
Speaker: Lionel Montrieux
Nakadi is an event broker, built on top of Kafka-like queues, that provides a REST API for easy integration with microservices. In this talk, we will briefly introduce Nakadi, and then focus on one of its most interesting features, Timelines.
Timelines allow administrators to transparently move topics between different clusters, without interrupting producers or consumers. This is useful for moving between clusters, possibly ones using different technologies, and also as a fail-over for publishers in case of downtime on a cluster. Timelines can also be used to move event types between brokers on the same cluster, as an alternative to rebalance operations. Unlike rebalance operations, timelines do not require copying additional data between brokers, which can slow the brokers down. We also plan to use timelines for re-partitioning, i.e. changing the number of partitions in an event type.
Operating Kafka and Zookeeper on AWS
Speaker: Ricardo De Cillo
Deploying and operating Kafka and Zookeeper reliably on AWS is a non-trivial task. Most teams learn it the hard way. In this talk we are going to share our experience growing a Kafka cluster to ingest 5 TB of data per day with 99.999% availability.
You can expect to hear about: Deployment, Monitoring, Recovering from various failure scenarios, Scaling, Hardware choices, Important configurations
There is no single solution to this challenge, so let's talk. What choices did you make regarding operations and why?
Bubuku - A Supervisor to Run Kafka on AWS
Speaker: Dmitriy Sorokin
Operating a Kafka cluster on a cloud platform such as AWS comes with its challenges: finding exhibitor nodes, updating the configurations, reacting to lost instances without downtime, rebalancing data across brokers, and many others. Daily operations on large clusters, with thousands of topics, are not trivial, and can take up a lot of time.
To simplify operations, we have built Bubuku, a supervisor for Kafka. In this talk, we will explore some of Bubuku's most useful features, such as rebalances and migration of data, updates of Zookeeper, and rolling restarts of Kafka brokers. We will share our experience in running a large cluster in production using Bubuku, and discuss future plans for the project.
Kafka and EBS
Speaker: Andrey Dyachkov (github (http://github.com/adyach))
A Kafka cluster can grow to hold a huge amount of data on its disks. Hosting Kafka requires handling instance termination (on purpose, or simply because the cloud provider decided to terminate the instance). In our case, a termination introduces a node with no data, and the whole cluster has to be rebalanced to distribute the data evenly among the nodes, which takes hours of data copying. In this talk, we will present how we avoid a rebalance after node termination in a Kafka cluster hosted on AWS.
The Tricky Thing about Offsets
Speaker: Max Schultze (twitter (https://twitter.com/mcs1408))
The Data Lake at Zalando is all about archiving data and making it accessible efficiently. For that, we consume a lot of data from Zalando's internal event bus Nakadi, which is based on Kafka. Additionally, we run another internal Kafka that data flows through first before it gets archived. This talk presents our current infrastructure and then turns to the concept that is easiest to get wrong during manual operations: offsets.
Kafka-powered Microservice
Speaker: Daniel Truemper (twitter (https://twitter.com/truemped), github (https://github.com/truemped))
Microservices have fueled an industry-wide transition toward massively distributed architectures with many more people working on the same systems. They also allow for greater individual ownership, freedom of deployment, and diverse technology choices.
In this distributed landscape, several topics emerge as important: easily creating and maintaining consistency among various views of business-critical data, decoupling reads from writes, and decoupling microservice go-lives, all while adhering strictly to incrementality.
This presentation will demonstrate how Kafka can function as a backbone for the microservice world thanks to log compaction, which makes it unique compared to other event brokers. We’ll show how log compaction is essential in delivering a consistent user experience, and how this can empower teams to develop their own autonomy. We will walk through a migration path from a hypothetical monolithic shop to a set of microservices.