Workshop: Spark and Machine Learning

Hosted By
Tony T. and 3 others

Details

• IMPORTANT!!! For security purposes, PLEASE MAKE SURE YOU PROVIDE YOUR FIRST AND LAST NAME WHEN YOU REGISTER FOR THIS EVENT (there will be a prompt for it).

Speaker:

Joseph Bradley (http://www.cs.cmu.edu/~jkbradle/)

Joseph is currently a Software Engineer at Databricks (http://databricks.com/). Previously, he was a postdoc working with Kannan Ramchandran (http://www.eecs.berkeley.edu/~kannanr/) and Martin Wainwright (http://www.eecs.berkeley.edu/~wainwrig/) at UC Berkeley (http://www.berkeley.edu/index.html). Joseph received his Ph.D. in Machine Learning (http://www.ml.cmu.edu/) from Carnegie Mellon University (http://www.cmu.edu/index.shtml), where he worked with Carlos Guestrin (http://homes.cs.washington.edu/~guestrin/) in the Select Lab (http://www.select.cs.cmu.edu/). He received his B.S.E. in Computer Science from Princeton University (http://www.princeton.edu/), where he did research with Robert E. Schapire (http://www.cs.princeton.edu/~schapire/).

Description:

Joseph will talk about Machine Learning with Spark, focusing on the decision tree and (upcoming) random forest implementations in MLlib. Spark has been established as a natural platform for iterative ML algorithms, and trees provide a great example. This talk aims both to give insight into the underlying implementation and to highlight best practices for using MLlib.

We'll start with how decision trees fit into Spark's computational framework. This deeper understanding will facilitate a discussion of performance, scaling, algorithmic optimizations, and tuning. Finally, we will mention random forests (coming soon to Spark). We'll use plenty of examples of learning trees on Spark clusters.

Tentative Schedule:

6:30pm - 7:00pm -- socializing

7:00pm - 7:10pm -- word from our host

7:15pm - 8:15pm -- main talk (Joseph Bradley)

8:15pm - 9:00pm -- socializing

Shout outs:

• This is a joint event with our friends over at SF Scala (https://www.meetup.com/SF-Scala)

• Special thanks to Databricks (http://www.databricks.com) for helping us put this event together on short notice.

• Special thanks to Yammer (https://www.yammer.com/) for helping us host this event!

If you want to learn more about Spark + ML ...

An Academy By the Bay (brought to you by our friends from SF Scala) will have a follow-up event (http://bythebay.ticketleap.com/deep-learning-september-2014/) focused on Spark and Machine Learning. It will be a professional training course on Distributed Deep Learning, focusing on deeplearning4j, Scala, Akka, and Spark.

SF Bayarea Machine Learning
Yammer HQ
1355 Market St · San Francisco, CA