Data Infra Talks & Social: Uber, RisingWave, PingCAP, WarpStream!


Details
Hi everyone! This meetup is in partnership with the San Francisco Stream Processing Group and the Cloud Native Silicon Valley user group. If you have already RSVP'd there, you do not need to RSVP again! :)
Agenda:
6.00-6.30: Networking & Snacks
6.30-7.30: Talks
- Richard Artoul (WarpStream) -- Beyond Tiered Storage: Zero Disk Architectures for Kafka
- Yingjun Wu (RisingWave) -- S3 as the state store for stream processing systems
- Yang Yang (Uber) and Zhifeng Chen (Uber) -- Real-time message delivery with uForwarder
- Matthew Penaroza (PingCAP) -- How to choose a database: an overview of major databases
7.30-8.00: Q&A
Abstracts and Speaker Bios
Richard Artoul -- Beyond Tiered Storage: Zero Disk Architectures for Kafka
Abstract: In this talk, Richard Artoul will discuss some of the limitations of tiered storage in Apache Kafka. He'll then present a different approach: WarpStream's "zero disk" architecture, which separates storage and compute entirely, with zero local disks, while still providing the full semantics of Kafka.
Bio: Richard Artoul is the co-founder and CEO of WarpStream, a drop-in replacement for Apache Kafka built directly on top of object storage with zero disks. Before WarpStream, he developed Datadog's columnar event store Husky.
Yingjun Wu -- S3 as the state store for stream processing systems
Abstract: S3 is cheap, but S3 is slow. Stream processing systems want to maintain internal state at low cost but cannot tolerate high latency. How could we build a stream processing system on top of S3? I'll tell you what we learned over the last three years.
Bio: Yingjun Wu is the founder and CEO of RisingWave Labs. Before starting the company, Yingjun was a software engineer on the Redshift team at Amazon Web Services and a researcher in the Database group at IBM Almaden Research Center. Yingjun received his PhD from the National University of Singapore and was a visiting PhD student at Carnegie Mellon University. He has been working in the field of stream processing and database systems for over a decade.
Yang Yang and Zhifeng Chen -- Real-time message delivery with uForwarder
Abstract: uForwarder is a proxy that transfers messages from Kafka to consumer services over an RPC protocol. It aims to reliably deliver fresh data in real time at scale.
Bio: Yang Yang is a Sr Staff Software Engineer at Uber, where she leads Uber's Streaming Data (Kafka) and Flink team. She focuses on building a highly scalable and reliable real-time data system for thousands of engineers and data scientists at Uber.
Zhifeng Chen is a Sr Staff Engineer and Tech Lead Manager on the Kafka Messaging Platform team at Uber and a co-author of uForwarder.
Matthew Penaroza -- How to choose a database: an overview of major databases
Abstract: Large enterprises like LinkedIn, Databricks, and Pinterest are building fault-tolerant applications that can process hundreds of TBs of data for real-time data serving. This talk simplifies the database landscape, from SQL and NoSQL to Distributed SQL, and discusses how and why these enterprises picked TiDB to power their production workloads.
Bio: Before becoming a software engineer, Matthew Penaroza was a professional Dota 2 player. Now he is a Senior Solutions Architect who works with large companies like Airbnb and LinkedIn to design their data infrastructure.
