
Operationalizing Kafka in the cloud, Benchmarks and SpotHero's uses

Hosted by Prem Nallasivampillai

Details

Session abstract:

Kafka is rapidly becoming the go-to tool for solving complex data problems. Its flexibility, resiliency, fault tolerance, high performance, vibrant community, ecosystem of complementary tools, and open source licensing under the Apache License 2.0 have allowed organizations to adopt it for use cases such as the core of their data pipeline, ETL, a real-time streaming data platform, and a database, to name a few. That organizations large and small have so quickly standardized on such a young technology as the heart of their data platform is a testament to these characteristics.

This session will cover how to deploy, operate, and manage hundreds of independent Apache Kafka clusters in public clouds. We share our experiences and approaches for maintaining high uptime, scaling up and down live, and rolling out software updates, both patches and major versions, across our deployments. We run Kafka clusters across six different cloud providers and a total of 80 geographical regions. In this session, we will touch on architecture, operations, and monitoring.

Also included in this session is a talk by SpotHero engineering on how Kafka addresses its business needs.

Benchmarking Apache Kafka Performance: Read & Write Throughput

The Kafka benchmark includes an estimated mix of read and write throughput rates for various Apache Kafka setups (plan tiers with 3/6/9 brokers; varying CPU/RAM for 3/6/9 brokers; etc.) in different public clouds (AWS, GCP, Azure, DigitalOcean, etc.). (Here’s the link to the first benchmark.) The benchmark takes typical customer message sizes into account and uses standard open source tools for producing load. The load generators ran on separate systems and connected over the public internet so that the load mimicked actual customer workloads as closely as possible. Note, however, that using VPC peering (or a local network) would notably improve performance (both throughput and latency).

Speaker Bio for Heikki Nousiainen, Co-founder & CTO of Aiven
Heikki is the CTO and co-founder of Aiven. Prior to co-founding Aiven, Heikki held a Software Architect role at F-Secure, pushing for cloud transformation and improved development productivity with DevOps. Heikki has a background in software engineering, but has also worked as an Information Security Specialist at TietoEnator, consulting and performing information security assessments. Heikki loves software and open source, and regularly presents at open source events. Heikki also runs the local Apache Kafka Meetup in his hometown, Helsinki.

Co-sponsored with Aiven. Food and beverages provided.

Chicago Area Kafka Enthusiasts
125 S Clark St · Chicago, IL