In this talk we describe our efforts, as part of the MLbase project, to develop a distributed machine learning platform on top of Spark. In particular, we present the details of two core components of MLbase, namely MLlib and MLI, which are scheduled for open-source release this summer. MLlib provides a standard Spark library of scalable algorithms for common learning settings such as classification, regression, collaborative filtering, and clustering. MLI is a machine learning API that facilitates the development of new ML algorithms and feature extraction methods. As part of our release, we include a library written against MLI containing standard and experimental ML algorithms, optimization primitives, and feature extraction methods.
The talk will be delivered by Ameet Talwalkar and Evan Sparks.
To attend this event, you MUST RSVP and BRING A PHOTO ID. Doors close at 7:15 PM.
Directions/Parking Guide From Twitter
We're located at 1355 Market St. (between 9th Street and 10th Street) on the 9th floor in the old SF Furniture Mart.
If you're taking BART or Muni, exit at Civic Center and walk up Market St. (a 5-10 minute walk). Several bus lines (the 6, 71, 21, etc.) stop right outside our building.
If you're taking Caltrain, get off at the 4th and King station and take the Muni 83X bus, which goes directly to our office (the 83X runs about every 20-25 minutes).
There is no bike storage at our office. If you are bringing a bike, you must lock it on the street at your own risk.
Coming by car? There is plenty of street parking nearby. There are also a few parking garages near the office. The Civic Center parking garage entrance is on McAllister St., between Polk St. and Larkin St. The lot is open until 12:00 AM. The Fox Plaza parking garage entrance is on Hayes St., north of Market St., on your left. The lot is open until 8:00 PM.
Ameet Talwalkar is an NSF post-doctoral fellow in the Computer Science Division at UC Berkeley. His work focuses on devising scalable machine learning algorithms and, more recently, on interdisciplinary approaches for connecting advances in machine learning to large-scale problems in science and technology. He obtained a bachelor's degree from Yale University and a Ph.D. from the Courant Institute at New York University.
Evan Sparks is a Ph.D. student in the Computer Science Division at UC Berkeley. His research focuses on the design and implementation of distributed systems for large-scale data analysis. Prior to Berkeley, he spent several years in industry tackling large-scale data problems as a Quantitative Financial Analyst at MDT Advisers and as a Product Engineer at Recorded Future. He holds a bachelor's degree from Dartmouth College.