March Kafka Meetup

Hosted By
Neha N.

Details

Join us for the March Apache Kafka Meetup at San Jose Convention Center on Tuesday, March 29 from 6pm-8pm. Presentations to be announced.

See you there!

Agenda

6:00pm: Doors Open

6:00pm - 6:30pm: Networking

6:30pm - 7:30pm: Presentations (See Below)

7:30pm - 8:00pm: Q & A and Networking

First Talk

Speaker: Guozhang Wang, Confluent

Bio: Guozhang is an engineer at Confluent, building a stream data platform on top of Apache Kafka. He received his PhD from the Cornell University database group, where he worked on scaling iterative data-driven applications. Prior to Confluent, Guozhang was a senior software engineer at LinkedIn, developing and maintaining its backbone streaming infrastructure on Apache Kafka and Apache Samza.

Title: Introduction to Kafka Streams

Abstract: In the past few years, Apache Kafka has emerged as the world's most popular backbone for real-time data streaming platforms. In this talk, we introduce Kafka Streams, the latest addition to the Apache Kafka project: a new stream processing library natively integrated with Kafka.

Kafka Streams has a very low barrier to entry, easy operationalization, and a natural DSL for writing stream processing applications. As such, it is the most convenient yet scalable option for analyzing, transforming, or otherwise processing data that is backed by Kafka. We will provide the audience with an overview of Kafka Streams, including its design and API, typical use cases, code examples, and a look at its upcoming roadmap. We will also compare Kafka Streams' lightweight library approach with heavier, framework-based tools such as Spark Streaming or Storm, which require you to understand and operate a whole separate infrastructure for processing real-time data in Kafka.

Second Talk

Speaker: Xavier Léauté, Metamarkets

Bio: Xavier heads the backend engineering team at Metamarkets and is one of the main contributors to Druid. In a prior life, he held quantitative research positions at BlackRock, Barclays Global Investors, and MSCI. He holds a Master's in Engineering from École Centrale Paris and an MEng in Operations Research from Cornell University.

Title: Streaming Analytics at 300 billion events/day with Kafka, Samza, and Druid

Abstract: Wonder what it takes to scale Kafka, Samza, and Druid to handle complex analytics workloads at petabyte size? We will share a high-level overview of the Metamarkets real-time stack, the lessons learned scaling our real-time processing to over 3 million events per second, and how we leverage extensive metric collection to handle heterogeneous processing workloads while keeping operational complexity and cost down. Built entirely on open source, our stack performs streaming joins using Kafka and Samza, feeding into Druid to serve 1 million interactive queries per day.

Bay Area Apache Kafka® Meetup