Streams, Tables, and Time in KSQL


Join us for an Apache Kafka meetup on January 10th, starting at 5pm, at the Humboldt-Universität zu Berlin. The address, agenda, and speaker information can be found below. See you there!
-----
Agenda:
5:00pm: Doors open
5:00pm - 5:15pm: Snacks, Drinks and Networking
5:15pm - 6:00pm: Matthias J Sax, Confluent
6:00pm - 6:45pm: Jakob van Santen, DESY
6:45pm - 7:00pm: Additional Q&A and Networking
-----
Speaker:
Matthias J Sax
Bio:
Matthias is a Kafka committer and software engineer at Confluent working on Kafka’s Streams API. Prior to Confluent, he was a PhD student at Humboldt-University of Berlin, conducting research on data stream processing systems. Matthias is also a committer at Apache Flink and Apache Storm.
Title:
Streams, Tables, and Time in KSQL
Abstract:
KSQL is the Streaming SQL engine for Apache Kafka that allows for continuous data stream processing. While KSQL looks very similar to SQL, it provides quite different semantics. First, KSQL queries can be defined over data streams, not just tables. Second, queries over tables are not snapshot queries, but run forever. And third, time is a core concept in KSQL and in data stream processing in general. In this talk, we explore the nature of Streaming SQL and its temporal semantics, which apply to both streams and tables. We will explain continuous query semantics and the relationship between streams and tables, and demystify the temporal nature of KSQL tables. Furthermore, we dig into filter, aggregation, and join operations over streams and tables, as well as stream-specific operators like windowing. At the end, you will be equipped to query streams and tables using KSQL and understand their temporal relationship to each other.
---
Speaker:
Dr. Jakob van Santen, DESY Zeuthen, Germany
Bio:
Jakob van Santen has been a postdoc at DESY in Zeuthen since 2015. He
works on simulation, event reconstruction, data analysis, and workflow
management for the IceCube Neutrino Observatory as well as real-time
data analysis with wide-field optical telescopes. He previously received
his Ph.D. from the University of Wisconsin-Madison in 2014, where he
measured the spectrum of high-energy neutrinos interacting in the
IceCube detector.
Title:
Data pipelines for transient astronomy - Automation, reproducibility, and provenance in the oldest science
Abstract:
Observational astronomy has expanded over the last decade to include two major new varieties: time-domain and multi-messenger astronomy. In time-domain astronomy, we observe the beginning and aftermath of explosions and disruptions of stars on time scales from seconds to months. In multi-messenger astronomy, we combine observations with photons, neutrinos, and gravitational waves into a more complete picture of the most violent processes in the universe. Fully exploiting these techniques requires different approaches to data distribution and analysis than are typical in astronomy.
In this talk, I will review the kinds of problems that arise in real-time astronomy and illustrate how they are addressed in the online analysis systems developed for the IceCube Neutrino Observatory at the South Pole and the Zwicky Transient Facility on Mount Palomar.
-----
Don't forget to join our Community Slack Team! https://launchpass.com/confluentcommunity
If you would like to speak at or host our next event, please let us know! community@confluent.io
NOTE: We are unable to cater for any attendees under the age of 18. Please do not sign up for this event if you are under 18.

