Advanced Data Science on Spark


Details
Guest Speaker: Reza Zadeh
Overview:
We discuss how to combine the scalability of Spark with machine learning and graph processing. This talk draws on a subset of the material from Stanford’s CME 323: Distributed Algorithms and Optimization. Lessons focus on building and using machine learning at scale via MLlib and GraphX.
Topics covered include:
• Building scalable Machine Learning algorithms on Spark, discussing design decisions inside MLlib and GraphX
• Understanding, from the designers of MLlib, how primitives like matrix factorization are implemented in a distributed framework
CME 323: http://stanford.edu/~rezab/dao
Bio: http://stanford.edu/~rezab/bio.html
The Toronto Hadoop User Group (THUG) and the Toronto Apache Spark meetup group are jointly listing and organizing this event: https://www.meetup.com/TorontoHUG/events/224677876/
Please RSVP to at least one of the two event listings; we will merge the participant lists.
Note:
• Please be aware that there is no limit on RSVPs, but the number of seats is limited.
• Please inform us in advance if you require any assistance because of mobility issues.
• Parking: some is available across the street in the Green P lot. The venue is also close to St. Andrew TTC Station.
Sponsors:
Gurulink (http://gurulink.ca/) is providing food and drink for the attendees.
Paytm Labs (http://paytmlabs.com/) is generously offering their office space for this event.
