A Deep-dive into Structured Streaming


Details
Hello Spark enthusiasts!
We would like to invite you to join us at the Marionete (http://www.marionete.co.uk) offices, where Tathagata Das (https://www.linkedin.com/in/tathadas) of Databricks (https://databricks.com/) will be speaking over live video. Please create an account here (https://community.cloud.databricks.com/) and bring your own laptop!
6:30-6:45pm: Mingling
6:45-7pm: Welcome
7-7:40pm: Apache Spark 2.0: Structured Streaming with Tathagata Das (https://www.linkedin.com/in/tathadas)
7:45-8:30pm: Databricks (https://databricks.com/) Community Edition Workshop with Carlos (https://pt.linkedin.com/in/carlosrodrigues5) and Raul (https://pt.linkedin.com/in/rauferreira/en)
8:30-8:40pm: Mingling
Abstract: A Deep-dive into Structured Streaming
In Apache Spark (http://spark.apache.org/) 2.0, we have extended DataFrames and Datasets in Spark to handle streaming data. Streaming Datasets not only provide a single programming abstraction for batch and streaming data, they also bring support for event-time based processing, out-of-order/delayed data, sessionization, and tight integration with non-streaming data sources and sinks. In this talk, Tathagata will take a deep dive into the concepts and the API and show how this simplifies building complex “continuous applications.”
Bio:
Tathagata Das (https://www.linkedin.com/in/tathadas) is an Apache Spark committer and a member of the PMC. He’s the lead developer behind Spark Streaming, and is currently employed at Databricks. Before Databricks, you could find him at the AMPLab of UC Berkeley, researching datacenter frameworks and networks with professors Scott Shenker and Ion Stoica.
