
Introduction to Spark

From Matei Zaharia:

"As big data becomes a concern for more organizations, there is a need for both faster tools to process it and easier-to-use APIs. Apache Spark is a Hadoop-compatible cluster computing engine that addresses these needs through (1) in-memory computing primitives that let it run 100x faster than Hadoop and (2) high-level APIs in Scala, Java and Python. In the past few years, Spark has quickly grown to be one of the most active projects in the big data space, with over 25 companies contributing, and a developer community second in size only to Hadoop. This talk will introduce the Spark programming model and API, show you how to get started using it, and talk about use cases in the community. Finally, we’ll cover the growing stack of higher-level tools built on top of Spark, including Spark Streaming for real-time processing, Shark for SQL, GraphX, and MLlib."

Pizza and drinks will be provided.

About the speaker:

Matei Zaharia is the creator of Apache Spark and is joining MIT CSAIL as an assistant professor next year. He recently completed his PhD at UC Berkeley, during which he worked closely with the open source big data ecosystem, becoming a committer on Apache Mesos and Hadoop. He is currently on leave to start Databricks, a company built around Spark, where he is CTO.


  • Matei Z.

    Thanks everyone for coming! For those wanting the slides, here they are:

    November 23, 2013

  • Perry S.

    The speaker, Matei, was great - very knowledgeable, eager to share, excited about his subject, and yet grounded. He had a great set of material, both slides and a demo running against an EC2 cluster. He also did a good job going through the material and also taking questions as they came up. A really solid intro to Spark.

    November 23, 2013

  • Qingxin W.


    November 22, 2013

  • Gary M.

    Really enjoyed Matei's presentation. I think there was a nice flow to it that clearly delineated not only the problem it solves but also the software's advantages over the other options on the market.

    November 22, 2013

  • Dan C.

    Matei's presentation was well-organized and very informative.
    Please post a link to the slides.

    November 21, 2013

  • Larry

    This guy did GREAT. He was prepared, had no last-minute equipment problems, and gave a good combination of examples and overview.

    November 21, 2013

  • Eugene B.

    A very informative introduction to Spark from its creator.

    November 21, 2013

  • David

    If you're interested in Big Data, you should definitely attend this upcoming Amazon Web Services DAMA presentation.

    Please be advised of this important Meetup: Amazon Web Services: Designing for Scale, December 16th at the Microsoft NERD Center in Cambridge.

    November 19, 2013

  • Gary M.

    Excited to attend! Our team here in Boston started using Spark 6 months ago and are finding it very enjoyable.

    November 14, 2013
