
Apache Spark - Easier and Faster Big Data + Collaborative Filtering

TALK #1 - Apache Spark - Easier and Faster Big Data, by Reynold Xin (Databricks)

ABSTRACT : Dubbed the leading successor to Hadoop MapReduce, Apache Spark is a cluster computing system that makes data analytics fast -- both fast to run and fast to write. Programs written in Spark can often outperform those in MapReduce by 100X, while being 10X shorter and more understandable. In addition, Spark provides efficient support for streaming, query execution, machine learning, and graph computation through rich high-level libraries. Last but not least, the project features one of the most active open source communities in Big Data: 170+ developers from 30+ organizations have contributed code to the project. In this talk, we will introduce the project, survey the high-level libraries including streaming, SQL, and machine learning, and expand into how Spark can help you make better decisions more easily and quickly.

BIO : Reynold Xin is a committer on Apache Spark and a co-founder of Databricks. He has been instrumental in the development of many high-level frameworks on Spark, including SQL and graph computation. Prior to Databricks, he was pursuing a PhD in the UC Berkeley AMPLab.

TALK #2 - Collaborative Filtering with Spark, by Christopher Johnson (Spotify)

ABSTRACT : Spotify uses a range of machine learning models to power its music recommendation features, including the Discover page and Radio. Because training these models is iterative, they suffer from the I/O overhead of Hadoop and are a natural fit for the Spark programming paradigm. In this talk I will present both the right way and the wrong way to implement collaborative filtering models with Spark. Additionally, I will take a deep dive into how Matrix Factorization is implemented in the MLlib library.

BIO : Chris Johnson is a Machine Learning dude at Spotify who hacks on music data and works on their music recommendation engine. Prior to Spotify, Chris was pursuing a PhD at UT Austin.

The rest of the agenda is up for grabs, so feel free to submit an idea!

  • Chris F.

    @francois: are the video archives available? i tried to find them using the livestream link/password that you provided, but no go.

    (also, apologies if i spammed the list earlier this evening. apparently, "return" means "post" in some contexts here on!)

    1 · May 12, 2014

  • Nitin k.

    Reynold... good presentation. It would be good if you could cover the differences, and also any benchmarks, between Storm Trident and Spark Streaming.

    May 11, 2014

  • Reynold X.

    I'd be happy to talk about other topics.

    April 18, 2014

    • Infant Rosario V.

      I would really like to hear about Spark Streaming vs Storm

      3 · May 7, 2014

    • Susheel K.

      Hi Reynold, could you share your presentation and also cover the difference between Spark Streaming and Storm?

      1 · May 11, 2014

  • Susheel K.

    Like it

    May 9, 2014

  • Chris J.

    Thanks everyone for coming. It was great meeting you all! Here are my slides from the event:

    2 · May 9, 2014

  • A former member

    Any links to presentations?

    3 · May 8, 2014

  • narendra

    Nice session on Spark, thanks to the speakers and organizers @Spotify

    1 · May 8, 2014

  • Ruze

    Amazing speakers - one of the founders of Spark

    May 7, 2014

  • vijay v.

    Best new meetup - loved the first session. Both presentations were very helpful.

    May 7, 2014

  • François Le L.

    Livestream of the event starting soon: Link:
    Password: Hkdl3i8lJ

    3 · May 7, 2014

  • Andreas R.

    Are the presentations going to be streamed or made available afterwards?

    3 · April 29, 2014
