Getting Started with Structured Streaming

Hosted By
Craig W.

Details

Structured Streaming was introduced in Spark 2.0 as a streaming component of Spark SQL’s popular DataFrame API. Streaming problems are inherently challenging; despite interest in streaming applications, many Spark users are slow to adopt them because of the steep learning curve. This talk provides a guided introduction to Structured Streaming, from API internals to business context and real examples that help you get started.

During this talk you will learn:

• Pain points of Spark Streaming with DStreams

• Structured Streaming programming model

• API features and best practices

• Advantages of Structured Streaming versus other streaming engines

• Common public datasets for testing Structured Streaming

• How to get started, including a live demo with published code

Speaker Bio
Myles Baker is a Solutions Architect who helps large enterprises develop Apache Spark applications using Databricks. He specializes in streaming and machine learning. His work on image processing software at NASA introduced him to distributed computing, and since then he has helped clients across multiple industries build data science models and applications at scale. He received a B.S. in Applied Mathematics from Baylor University and an M.S. in Computer Science from the College of William and Mary.

Atlanta Apache Spark User Group