Large-Scale Analytics with Apache Spark

Abstract: Is Apache Spark the answer to all my big data problems? What distinguishes Spark from Hadoop? Do I have to become a Scala expert in order to use Spark? Can I do large-scale machine learning with Spark?
In this talk, I will answer these and other questions based on our experience at Thomson Reuters R&D with the MapReduce framework Spark. After a short introduction presenting the underlying technology, I will show how Spark can help with your data analysis tasks. I will discuss the various recent extensions including GraphX, Spark SQL, and in particular MLlib, the Spark library for machine learning (ML). The talk will conclude with a comparison between Spark's ML capabilities and other frameworks (e.g., Mahout, H2O).

Speaker: Frank Schilder, from Thomson Reuters, obtained his Ph.D. in Cognitive Science from the University of Edinburgh, Scotland. His research interests include discourse analysis, summarization and information extraction. His summarization work has been implemented as the snippet generator for search results of WestlawNext and he is currently involved in various large-scale machine learning projects. Frank has successfully participated in several research competitions on automatic summarization systems such as the Text Analysis Conference (TAC) carried out by the National Institute of Standards and Technology (NIST). Before joining Thomson Reuters, he was employed by the Department for Informatics at the University of Hamburg, Germany, as an assistant professor.

Parking: There are two options to pay for parking in the adjacent Anderson ramp. You can either enter/exit with a credit card, or you can take a ticket and use the pay kiosk on the northeast corner of the ramp to get an exit ticket.

Food: Pizza and drinks, first come, first served, starting at 6:30 PM, provided by Cloudera.

Map: http://bit.ly/RCtaTI

  • Brad R.

    The slides are now in the Meetup Files area (it is under More above). The screencast (slides and synchronized audio) will be available for a few weeks at https://www.dropbox.com/s/zkpadx4lowiquqo/Large-Scale%20Analytics%20with%20Apache%20Spark.mp4?dl=0

    September 24, 2014

  • Ryan B.

    Thanks to Frank Schilder for a great presentation! The slides are posted here: http://www.slideshare.net/RyanBosshart/spark-meetup-tchug. We should have a recording coming soon!

    2 · September 23, 2014

  • Sona

    Fantastic presentation !!! Coursera is offering a class on functional programming with Scala. https://www.coursera.org/course/progfun
    It started on the 15th, there is still time to enroll.

    4 · September 23, 2014

  • Dan M.

    Great presentation. I liked the way Frank pointed out that there are some functions that you cannot do with a one-pass MapReduce transform. Graph traversal and light-weight inference are great examples of where RDDs rock!

    September 23, 2014

  • Ken W.

    I'll be happy to get the slides/recording too, I really wanted to come but I'm on kid duty and I couldn't get a sitter.

    September 22, 2014

  • Andy W.

    Could someone post the slides on Meetup?

    September 22, 2014

  • George S

    Good meet

    September 22, 2014

  • David L.

    Meetup's GPS coordinates are on the other side of campus. Look at the campus map in the talk description.

    September 22, 2014

  • Brad R.

    See Parking information above.

    1 · September 22, 2014

    • Jeff W.

      Got it now, the parking info did not show in the meetup mobile app.

      September 22, 2014

  • Jeff W.

    Any special parking concerns since school is back in session?

    September 22, 2014

  • Brad R.

    We intend to capture the slides w/audio and post it later.

    September 22, 2014

  • One Ä.

    Can somebody capture this Meetup with a video camera? I want to attend but work interferes.

    September 22, 2014

Our Sponsors

  • U of St. Thomas, Graduate Programs in SW

    The Center of Excellence for Big Data (CoE4BD) provides meeting space.

  • Cloudera

    Cloudera funds this Meetup site.
