Preview of Spark Streaming

  • June 20, 2012 · 6:15 PM

This meetup will feature the first preview of Spark Streaming, the extension to the Spark cluster computing framework that supports near-real-time stream processing. Spark Streaming is under active development at Berkeley with help from Conviva, and will likely be released as an alpha later this summer. When finished, it will let users combine streaming, batch and interactive queries behind the same rich API and fast, in-memory computing engine.

In addition, there will be an overview of improvements to the Spark engine currently in the "dev" branch, and future development plans. We also plan to solicit feedback from users on which features they want us to prioritize.

The meetup will be hosted at Yelp in San Francisco. Food will be provided. Doors open at 6:15 PM, and talks start at 7 PM.

Important: Please register by Monday, June 18th, with both your first and last names. The organizers need a list of attendees in advance. (If you'd prefer not to list your real name online, you can email [masked].)

 

More about Spark Streaming

Spark Streaming lets users run fault-tolerant continuous queries with 1-2 second latency on large data streams. It offers a rich functional interface similar to Spark's, in which users map, filter, join, and reduce streams (among other operations) using functions written in the Scala programming language. The system automatically distributes the work across machines and recovers from failures and stragglers, even for stateful operators such as a reduce over a sliding window. In addition, users can combine streams with historical data computed through batch jobs, or run ad-hoc queries on stream state from the Scala interpreter, which provides a powerful real-time analytics environment. While Spark Streaming is still in development, early results show that it performs similarly to, and often significantly better than, current open source stream processing frameworks, while offering a richer programming model and stronger fault tolerance guarantees. A short paper on the system is available at http://www.cs.berkeley.edu/~matei/papers/2012/hotcloud_spark_streaming.pdf.
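
To make the idea of a stateful "reduce over a sliding window" concrete, here is a minimal sketch in plain Python. It is not the Spark Streaming API, which had not been released at the time of this announcement; it only assumes that a stream can be modeled as a sequence of small batches of text lines. The function name windowed_word_counts and the sample data are hypothetical.

```python
from collections import Counter, deque
from typing import Deque, Iterable, Iterator, List


def windowed_word_counts(batches: Iterable[List[str]], window: int) -> Iterator[Counter]:
    """Yield word counts over the last `window` batches, once per incoming batch.

    Illustrative only: a single-process model of a sliding-window reduce,
    with no distribution or fault recovery.
    """
    recent: Deque[Counter] = deque()  # per-batch counts still inside the window
    total: Counter = Counter()        # running reduce over the whole window

    for lines in batches:
        # map: split each line into lowercase words
        # filter: keep only purely alphabetic tokens
        words = [w.lower() for line in lines for w in line.split() if w.isalpha()]
        batch_counts = Counter(words)

        # reduce: fold the newest batch into the window state
        recent.append(batch_counts)
        total.update(batch_counts)

        # slide: subtract the batch that just left the window
        if len(recent) > window:
            total.subtract(recent.popleft())
            total = +total  # unary plus drops entries whose count fell to zero

        yield Counter(total)  # copy, so callers cannot mutate the window state


# Hypothetical input: three small batches arriving one after another.
stream = [
    ["spark streaming", "spark"],
    ["streaming preview"],
    ["spark"],
]
results = list(windowed_word_counts(stream, window=2))
```

The sketch keeps the window state incrementally, adding the newest batch and subtracting the oldest, rather than recomputing the whole window each time. That running state is what a real system would have to rebuild after a machine failure, which is why recovery for stateful operators is singled out above.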

 

The project will be presented by Tathagata Das, Haoyuan Li and Matei Zaharia, the team behind the research effort.


  • Suhas K.

    Hello, is there a beta version available that I could try out?

    December 15, 2012

  • Joakim S.

    Do you plan TCP support for Spark Streaming?

    September 10, 2012

  • Rich M.

    Spark Streaming was even more impressive than the demo of Spark and Shark at the Hadoop Summit. Loved it.

    June 21, 2012

  • Ryan H.

    Nice presentation and interesting people to chat with.

    June 21, 2012

  • Matei Z.

    Thanks everyone for coming by! I've uploaded today's slides at http://files.meetup.com/3138542...

    June 20, 2012

  • A former member

    No, I believe this one is focused on the streaming portion, which is a work in progress at this time. The Summit talk covered mostly Spark and Shark and only briefly mentioned streaming...

    June 16, 2012

  • A former member

    Is this similar to the Hadoop Summit talk?

    June 16, 2012

Our Sponsors

  • Databricks

    video streaming / recording

  • O'Reilly Media

    Conference coupons, new ebooks/videos samplers; new reports, etc.

  • Cloudera

    Kindly providing food & drink!
