
Details

Something special this time! Our speaker will be Paco Nathan, Director of Community Evangelism at Databricks. Paco will be telling us all about 'Shiny New Bits in Spark Streaming'.

Abstract

To paraphrase the immortal crooner Don Ho: "Tiny Batches, in the wine, make me happy, make me feel fine."

http://youtu.be/mlCiDEXuxxA

Apache Spark provides support for streaming use cases, such as real-time analytics on log files, by leveraging a model called discretized streams (D-Streams). These "micro batch" computations operate on small time intervals, generally from 500 milliseconds up. One major innovation of Spark Streaming is that it leverages a unified engine. In other words, the same business logic can be used across multiple use cases: streaming, but also interactive, iterative, machine learning, etc.
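
As a rough illustration of the micro-batch idea, here is a minimal sketch in plain Python. It is not the Spark API: the function names `count_status_codes` and `discretize`, the log format, and the sample events are all made up for this example. It only shows how a stream can be cut into fixed time intervals and how the same business logic can then run unchanged on each micro-batch and on the full data set.

```python
from collections import Counter, defaultdict


def count_status_codes(lines):
    """Shared business logic: count the status code at the end of each log line."""
    return Counter(line.split()[-1] for line in lines)


def discretize(events, batch_interval_ms=500):
    """Group (timestamp_ms, line) events into micro-batches by time interval.

    Intervals that received no events are simply skipped in this sketch.
    """
    batches = defaultdict(list)
    for timestamp_ms, line in events:
        batches[timestamp_ms // batch_interval_ms].append(line)
    # Return the batches in time order.
    return [batches[key] for key in sorted(batches)]


# Hypothetical log events: (arrival time in milliseconds, log line).
events = [
    (0, "GET /index 200"),
    (120, "GET /missing 404"),
    (510, "GET /index 200"),
    (980, "POST /login 500"),
    (1020, "GET /index 200"),
]

# "Streaming" use: apply the logic to each 500 ms micro-batch in turn.
per_batch = [count_status_codes(batch) for batch in discretize(events)]

# "Batch" use: apply the very same function to the whole data set at once.
overall = count_status_codes(line for _, line in events)

for index, counts in enumerate(per_batch):
    print(f"batch {index}: {dict(counts)}")
print(f"overall: {dict(overall)}")
```

Adding up the per-batch counts gives the same totals as the single batch run, which is the point of reusing one piece of logic in both modes.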

This talk will compare case studies of production Spark Streaming deployments, cover emerging design patterns for integration with popular complementary OSS frameworks and some of the more advanced features such as approximation algorithms, and take a look at what's ahead, including the new Python support for Spark Streaming that will be in the upcoming 1.2 release.

Also, let's chat a bit about the new Databricks + O'Reilly developer certification for Apache Spark…

Sponsors

Evolution AI
The organiser, Evolution AI, is an award-winning data extraction firm.
Man Group
In-person Meetup venue host.
G-Research
In-person Meetup venue host.
ArcticDB
In-person Meetup sponsor.
Capgemini
IT services global leader.
