Advanced Data Science on Spark


Details
Guest Speaker: Reza Zadeh
Overview:
We discuss how to combine the scalability of Spark with machine learning and graph processing. This talk covers a subset of the material from Stanford’s CME 323: Distributed Algorithms and Optimization. Lessons focus on building and using machine learning at scale via MLlib and GraphX. Topics covered include:
- Building scalable Machine Learning algorithms on Spark, discussing design decisions inside MLlib and GraphX
- Understanding, from the designers of MLlib, how primitives like Matrix Factorization are implemented in a distributed framework
CME 323: http://stanford.edu/~rezab/dao
Bio: http://stanford.edu/~rezab/bio.html
Note: this event is cross-listed with the new Spark User Group:
https://www.meetup.com/Toronto-Apache-Spark/events/224035398/
I am not setting RSVP limits, as this event is cross-listed with the Spark Meetup Group. Please RSVP to at least one of the events to let us know you are coming; we can union the lists. Be warned that some people will not have a seat. If you do find a seat, please be kind and be ready to give it to those who need it more than you do.
Food and Drinks will be provided.
