We've got ourselves a pretty awesome event: a great Apache Spark workshop by Amanda Casari and Deborah Siegel AND a great presentation by Tim Hunter on Combining ML Frameworks, including scikit-learn and TensorFlow, with Apache Spark!
• 6:00pm-6:20pm: Networking
• 6:20pm-6:30pm: Introductions
• 6:30pm-7:00pm: Combining ML Frameworks with Apache Spark
• 7:00pm-8:15pm: Jump Start into Apache Spark Workshop
• 8:15pm-8:30pm: Finale
Jump Start into Apache Spark Workshop (Eastside)
We will be running a Jump Start into Apache Spark Workshop on the Eastside to help you learn Apache Spark. If you plan to attend, please sign up for Databricks Community Edition (http://go.databricks.com/databricks-community-edition-beta-waitlist) and bring your laptop so we can work through the workshop together. The workshop includes:
• Intro to Spark's Architecture, Modules, and Language APIs
• How to start prototyping and exploring data with Spark in notebooks
• Code Examples to Work with
Amanda Casari, Concur, and Deborah Siegel, Northwest Genomics Center
Combining Machine Learning Frameworks with Apache Spark
Machine Learning (ML) workflows involve a sequence of processing and learning stages. Realistic workflows combine specialized libraries with more general data management workflows.
Apache Spark is well known as a powerful platform for the iterative computations that ML requires. This talk presents how to combine the strengths of Spark’s ML library (MLlib) with popular packages such as scikit-learn and TensorFlow. Scikit-learn is the de facto standard ML library for Python, and TensorFlow is a library for deep learning recently open-sourced by Google.
We also discuss the improvements to MLlib in Spark 2.0 and the future of MLlib’s APIs. The roadmap includes more algorithms and features for users, as well as more utilities and abstractions to aid developers.
Tim Hunter, Databricks