Details

Hello Spark Enthusiasts!

We would like to invite you to join us at the Marionete (http://www.marionete.co.uk) offices, where Tathagata Das (https://www.linkedin.com/in/tathadas) of Databricks (https://databricks.com/) will give a talk over live video. Please create an account here (https://community.cloud.databricks.com/) and bring your own laptop!

6:30-6:45pm: Mingling

6:45-7pm: Welcome

7-7:40pm: Apache Spark 2.0: Structured Streaming with Tathagata Das (https://www.linkedin.com/in/tathadas)

7:45-8:30pm: Databricks (https://databricks.com/) Community Edition Workshop with Carlos (https://pt.linkedin.com/in/carlosrodrigues5) and Raul (https://pt.linkedin.com/in/rauferreira/en)

8:30-8:40pm: Mingling

Abstract: A Deep-dive into Structured Streaming

In Apache Spark (http://spark.apache.org/) 2.0, we have extended DataFrames and Datasets in Spark to handle streaming data. Streaming Datasets not only provide a single programming abstraction for batch and streaming data, they also bring support for event-time-based processing, out-of-order/delayed data, sessionization, and tight integration with non-streaming data sources and sinks. In this talk, Tathagata will take a deep dive into the concepts and the API and show how this simplifies building complex “continuous applications.”

Bio:

Tathagata Das (https://www.linkedin.com/in/tathadas) is an Apache Spark committer and a member of the PMC. He’s the lead developer behind Spark Streaming and is currently employed at Databricks. Before Databricks, he was at the AMPLab at UC Berkeley, researching datacenter frameworks and networks with professors Scott Shenker and Ion Stoica.
