
Streaming Meetup: Streaming SQL, Kinesis at Lyft and Apache Calcite

Hosted By
Mark

Details

All,
Lyft is helping to organize three talks related to streaming.

Important Note: You must register for the event (free) on ti.to (https://ti.to/big-data/streaming-meetup-at-lyft/with/dulkmc11ypq) before the event. You will then be sent an eNDA, which needs to be signed 24 hours before the event for security reasons. A badge will be pre-printed for you when you arrive at the event.

Talk #1: Foundations of streaming SQL or: How I learned to love stream and table theory

This talk will address all of those questions in two parts.
First, we’ll explore the relationship between the Beam Model (as described in The Dataflow Model paper and the Streaming 101 and Streaming 102 blog posts) and stream & table theory (as popularized by Martin Kleppmann and Jay Kreps, etc., but essentially originating out of the database world).

Second, we’ll apply our clear understanding of that relationship towards explaining what is required to provide robust stream processing support in SQL. We’ll discuss concrete efforts that have been made in this area by the Apache Beam, Calcite, and Flink communities, compare them to other offerings such as Kafka KSQL & Spark Structured Streaming, and talk about new ideas yet to come.

In the end, you can expect to have a much better understanding of the key concepts underpinning data processing, regardless of whether that data processing is batch or streaming, SQL or programmatic, as well as a concrete notion of what robust stream processing in SQL looks like.

Speaker: Tyler Akidau (Google)
Tyler Akidau is a staff software engineer at Google Seattle. He leads technical infrastructure’s internal data processing teams (MillWheel & Flume), is a founding member of the Apache Beam PMC, and has spent the last seven years working on massive-scale data processing systems. He is the author of the 2015 Dataflow Model paper and the Streaming 101 and Streaming 102 articles on the O’Reilly website. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.

Talk #2: Kinesis at Lyft

This talk will focus on how we used Amazon Kinesis to build the pub-sub infra at Lyft, which ingests more than 100 billion events per day. We'll review the strengths and weaknesses of Kinesis as a choice for streaming events in real time at Lyft's scale, as well as the best practices and lessons learnt over time.

Speaker: Hafiz Hamid (Lyft)

Hafiz Hamid is a software engineer on the Pub-Sub/Streaming Platform team at Lyft. He has built some of the key pieces in the messaging & streaming infrastructure at Lyft. Previously, Hafiz was a technical lead at Bing Search where he worked on data pipelines, relevance and web crawlers.

Talk #3: Data all over the place! How Apache Calcite brings SQL and sanity to streaming and spatial data

The revolution has happened. We are living in the age of the deconstructed database. Modern enterprises are powered by data, and that data lives in many formats and locations, in-flight and at rest, but somewhat surprisingly, the lingua franca remains SQL. In this talk, Julian describes Apache Calcite, a toolkit for relational algebra that powers many systems including Apache Beam, Flink and Hive. He discusses some areas of development in Calcite: streaming SQL, materialized views, enabling spatial query on vanilla databases, and what a mash-up of all three might look like.

Speaker: Julian Hyde (Looker)

Julian Hyde is an expert in query optimization and in-memory analytics. He founded Apache Calcite, a framework for query optimization and data virtualization. He also founded Mondrian, the popular open source OLAP engine, and co-founded SQLstream, an early streaming SQL platform. He is an architect at Looker.

Agenda:
6:00 - 6:30 pm: Check in and settle, networking
6:30 - 6:35 pm: Intros
6:35 - 7:10 pm: Talk #1 Tyler Akidau (Google)
7:15 - 7:50 pm: Talk #2 Hafiz Hamid (Lyft)
7:55 - 8:30 pm: Talk #3 Julian Hyde (Looker)
8:30 - 8:45 pm: Wrap up

SF Big Analytics
Lyft HQ
185 Berry Street · San Francisco, CA