Hosted at Confluent HQ: Exactly Once, Kubernetes and CDC with Pinterest

128 people went

Confluent

444 High St #100 · Palo Alto, CA

How to find us

ALL ATTENDEES MUST REGISTER AT LEAST 24 HOURS PRIOR TO THE EVENT ON THIS LINK: https://forms.gle/wag6AJnQKzfbyiyG9 (All information provided here will be used for the sole purpose of security)

Details

**TO ATTEND**

1) RSVP below
2) Fill in this short form: https://forms.gle/wag6AJnQKzfbyiyG9
3) Prior to the event, you will receive an email asking you to register for the event and sign an NDA. Complete these steps.

Join us for an Apache Kafka® meetup on July 16 at 6:30pm, hosted by Confluent in Palo Alto! Address, agenda and speaker information can be found below. See you there!

6:30pm - 7pm: Entry (networking/pizza)
7pm - 7:40pm: Rajesh RC, Confluent
7:40pm - 8:20pm: Jason Gustafson/Boyang Chen, Confluent
8:20pm - 9pm: Liquan Pei, Pinterest
9pm - 9:30pm: More Q&A/Networking

--

Talks:

Speaker: Rajesh RC is a Software Engineer at Confluent, where he works on the Control Plane team. He builds distributed systems using Kubernetes to manage the lifecycle of Confluent Platform across cloud providers and on-premises data centers. Previously, he worked at American Express, building a PaaS platform on top of OpenShift.

Talk: Lifecycle Management Of Confluent Platform on Kubernetes using Operator

The Confluent Operator helps minimize the operational cost of running Confluent Platform, an event streaming platform, on Kubernetes so that customers can focus on business-critical tasks. Kubernetes is becoming the de facto container-centric management environment for deploying and managing containerized workloads across private and public clouds. The Confluent Operator extends Kubernetes with custom resources and custom controllers that deploy and manage Confluent Platform. It provides automated provisioning, upgrades, security setup, monitoring, resiliency, and operations management.

In this technical deep dive, we will go through lifecycle management of Confluent Platform in the Kubernetes ecosystem.

--

Speakers: Jason Gustafson is a committer on Apache Kafka and a member of the PMC. He has made numerous contributions including support for exactly once semantics and core improvements to the replication protocol.

Talk: Exactly Once Semantics Revisited

Two years ago, we helped contribute a framework for exactly once semantics (EOS) to Apache Kafka. This much-needed feature brought transactional guarantees to stream processing engines such as Kafka Streams. In this talk, we will reflect on that work now that usage has gradually picked up steam. What did we get right and what did we get wrong? Most importantly, we will discuss how the work is continuing to evolve in order to provide more reliability and better performance. This talk assumes basic familiarity with Kafka and the log abstraction. You will come away with a deeper understanding of the underlying architecture of the EOS framework in Kafka, what its limitations are, and how you can use it to solve problems.

--

Speaker: Liquan Pei is currently the tech lead on the Ads Infrastructure Team at Pinterest. He is primarily working on building a real-time streaming platform for ads budgeting optimization, serving over 250M monthly active users at Pinterest. He is also an open source contributor to Apache Kafka and Spark.

Talk: Change Data Capture in Distributed Databases

This will be a deep dive into Change Data Capture (CDC) in TiDB and CockroachDB, two widely adopted NewSQL databases that are both transactional and distributed. CDC is an important database feature because it is the basis for database replication, backup and event-driven computation. Publishing database changes to Kafka is also a common practice nowadays, and it powers many use cases.

CDC in distributed databases faces many unique challenges:

1. Global ordering based on transaction IDs for transactions executed across multiple machines
2. Low latency to support database replication and event-driven processing
3. Scalability to handle an increasing number of transactions per second
4. High availability and tolerance to failures

In this talk, we will focus on how the systems are designed and built to handle those challenges. We will also highlight how Kafka is used in CDC for TiDB and CockroachDB.