Apache Spark Workshop and Combining ML Frameworks with Apache Spark

Concur Technologies

601 108th Ave NE #1000 · Bellevue

How to find us

The Concur Conference Center is on the first floor of the Concur Building, across from Jimmy John's. Paid parking is available in the building, and there is street parking nearby.


We've got a pretty awesome event lined up: a great Apache Spark workshop by Amanda Casari and Deborah Siegel, and a great presentation by Tim Hunter on combining ML frameworks, including scikit-learn and TensorFlow, with Apache Spark!


• 6:00pm-6:20pm: Networking

• 6:20pm-6:30pm: Introductions

• 6:30pm-7:00pm: Combining ML Frameworks with Apache Spark

• 7:00pm-8:15pm: Jump Start into Apache Spark Workshop

• 8:15pm-8:30pm: Finale

Jump Start into Apache Spark Workshop (Eastside)

We will be running a Jump Start into Apache Spark Workshop on the Eastside to help you learn Apache Spark. If you plan to attend, please sign up for Databricks Community Edition (http://go.databricks.com/databricks-community-edition-beta-waitlist) and bring your laptop so we can work through the workshop together. The workshop includes:

• Intro to Spark's Architecture, Modules, and Language APIs

• How to start prototyping and exploring data with Spark in notebooks

• Code Examples to Work with


Amanda Casari, Concur, and Deborah Siegel, Northwest Genomics Center

Combining Machine Learning Frameworks with Apache Spark

Machine Learning (ML) workflows involve a sequence of processing and learning stages. Realistic workflows combine specialized libraries with more general data management workflows.

Apache Spark is well known as a powerful platform for the iterative computations that ML requires. This talk presents how to combine the strengths of Spark’s ML library (MLlib) with popular packages such as scikit-learn and TensorFlow. Scikit-learn is the de facto standard ML library for Python, and TensorFlow is a library for deep learning recently open-sourced by Google.
We also discuss the improvements to MLlib in Spark 2.0 and the future of MLlib’s APIs. On the roadmap are more algorithms and features for users, as well as more utilities and abstractions to aid developers.


Tim Hunter, Databricks