Join us for an Apache Kafka meetup on April 11th from 6:30pm to 8:30pm, hosted by Uber in San Francisco. The address is 1455 Market Street. The agenda and speaker information can be found below. See you there!
6:30pm: Doors open
6:30pm - 6:45pm: Networking, Pizza and Drinks
6:45pm - 7:15pm: Presentation #1: Streaming data at Uber, Aditya Auradkar, Uber
7:15pm - 7:45pm: Presentation #2: A Streaming Platform Architecture based on Apache Kafka, Ewen Cheslack-Postava, Confluent
7:45pm - 8:15pm: Additional Q&A and Networking
Aditya manages the Streaming Data platform team at Uber. The platform powers use cases such as pub-sub style event transport, streaming and batch analytics, and ingestion. Previously at LinkedIn, he managed the Apache Kafka engineering team and was one of the earliest members of the Espresso (distributed storage) team.
Streaming data at Uber
Building data infrastructure is pretty hard! Building a multi-datacenter active-active real time data pipeline for multiple classes of data with different durability, latency and availability guarantees is even harder.
Streaming infrastructure powers critical pieces of Uber (think Surge). In this talk we will discuss our architecture, our technical challenges, and how a blend of open source infrastructure and in-house technologies has helped Uber scale.
Ewen Cheslack-Postava is a Kafka committer and engineer at Confluent building a stream data platform based on Apache Kafka to help organizations reliably and robustly capture and leverage all their real-time data. He received his PhD from Stanford University where he developed Sirikata, an open source system for massive virtual environments. His dissertation defined a novel type of spatial query giving significantly improved visual fidelity, and described a system for efficiently processing these queries at scale.
A Streaming Platform Architecture based on Apache Kafka
What happens if you take everything that is happening in your company -- every click, every database change, every application log -- and make it all available as a real-time stream of well-structured data? This session will explain how to combine the full Apache Kafka toolkit to accomplish this and shift from batch-oriented data integration and data processing to real-time streams and real-time processing.
We will explain how the design and implementation of Kafka enable it to act as a scalable platform for streams of event data. The Kafka Connect API is a tool for scalable, fault-tolerant data import and export; it turns Kafka into a hub for all your real-time data and bridges the gap between real-time and batch systems. The Kafka Streams API is a new library built right into Kafka that provides the corresponding processing support. It is built on Kafka's existing low-level clients and provides a very low barrier to entry, easy operationalization, and a natural DSL for writing stream processing applications. Together, these three components provide everything you need for a data pipeline: storage, import/export, and processing.
Finally, we'll describe an architecture for a stream data platform that combines these tools to react to all your inbound streams of state. This architecture only requires tools that ship with Apache Kafka, is lightweight in terms of deployment and management, and yet can scale to support large organizations with massive data pipelines.
Special thanks to Uber (https://www.uber.com/) who are hosting us for this event.
Don't forget to join our Community Slack Team (https://slackpass.io/confluentcommunity)!
If you would like to speak at or host our next event, please let us know! [masked]
NOTE: We are unable to cater for any attendees under the age of 18. Please do not sign up for this event if you are under 18.