Building GTFS-Realtime data pipeline/Tune RocksDB for Kafka Streams State Stores

Hosted By
Alice R.

Details

Hello Kafkateers!

In order to do our part to help flatten the curve of the spread of COVID-19, we are moving all of our meetups online for the time being. Please find the details to join this fun and informative meetup below.

Find information about upcoming meetups and tons of content from past Kafka Meetups all over the world:
cnfl.io/meetup-hub

-----
Agenda:
6:00pm-6:10pm: Online Networking (feel free to BYOB!!)

6:10pm-6:55pm: Building a reliable GTFS-Realtime data pipeline to support Mobimeo’s routing capabilities, Sergey Burkov, R&D Engineering Lead, Mobimeo

6:55pm-7:25pm: Performance Tuning RocksDB for Kafka Streams’ State Stores, Dhruba Borthakur, CTO & Co-founder, Rockset and Bruno Cadonna, Contributor to Apache Kafka & Software Developer at Confluent

7:25pm-7:30pm: Any additional Q&A

Joining our Slack space is not instant, so make sure you have access in time for the event. If you can, follow the steps at this link before the day of the event: https://launchpass.com/confluentcommunity

Speaker:
Sergey Burkov, R&D Engineering Lead, Mobimeo

Title:
Building a reliable GTFS-Realtime data pipeline to support Mobimeo’s routing capabilities

Abstract:
Routing capabilities are an essential part of any navigation app. When a user searches for a route from A to B, they expect to get very accurate and relevant route suggestions.

Routing is a very complex engineering challenge where data plays a critical role. Its freshness, accuracy, and coverage are key success factors for finding an optimal route.

Routers that operate on stale data are useless because, in reality, buses get stuck in traffic jams, trains can get cancelled due to union strikes, and normal operating schedules are often changed due to planned construction work or completely unplanned challenges. In this talk, Sergey will discuss how Apache Kafka® and the Confluent stack play an essential role in a very complex and robust real-time data pipeline which fuels Mobimeo’s routing capabilities.

-----
Speakers:
Dhruba Borthakur, CTO & Co-founder, Rockset and Bruno Cadonna, Contributor to Apache Kafka & Software Developer at Confluent

Title: Performance Tuning RocksDB for Kafka Streams’ State Stores

Abstract:
RocksDB is the default state store for Kafka Streams. In this talk, we will discuss how to improve single-node performance of the state store by tuning RocksDB and how to efficiently identify issues in the setup. We start with a short description of the RocksDB architecture. We discuss how Kafka Streams restores the state stores from Kafka by leveraging RocksDB features for bulk loading of data. We give examples of hand-tuning the RocksDB state stores based on Kafka Streams metrics and RocksDB’s metrics. At the end, we dive into a few RocksDB command-line utilities that allow you to debug your setup and dump data from a state store. We illustrate the usage of the utilities with a few real-life use cases. The key takeaway from the session is the ability to understand the internal details of the default state store in Kafka Streams so that engineers can fine-tune its performance for different varieties of workloads and operate the state stores in a more robust manner.

-----
Online Meetup Etiquette:
• Please unmute yourself when you have a question.
• Please hold your questions until the end of the presentation or use the Zoom chat!
• Please arrive on time, as Zoom meetings can become locked for many reasons. If you get locked out, a recording will be available, but you may have to wait a little while for it!

----
If you would like to speak at or host our next event, please let us know at community@confluent.io

Berlin Apache Kafka® Meetup by Confluent