Joint Spark London and Machine Learning Meetup

Name: Joint Spark London and Machine Learning Meetup
Start: 2014-06-19T19:00:00+01:00
End: 2014-06-19T22:00:00+01:00
Location: Royal Statistical Society

Hosted By

Martin G.

Details

We are hosting a joint meetup between Spark London and Machine Learning London. Given the current excitement around Spark in the machine learning community, a joint meetup is in order!

Michael Armbrust from the Apache Spark core team will be flying over from the States to give us a talk in person. Thanks to our sponsors, Cloudera, MapR and Databricks, for helping make this happen.

The first part of the talk will cover MLlib, the machine learning library for Spark, and the second part will cover Spark SQL.

If you have already signed up on the Spark London page, please don't sign up again here!

Abstract for part one:

In this talk, we’ll introduce Spark and show how to use it to build fast, end-to-end machine learning workflows. Using Spark’s high-level API, we can process raw data with familiar libraries in Java, Scala or Python (e.g. NumPy) to extract the features for machine learning. Then, using MLlib, its built-in machine learning library, we can run scalable versions of popular algorithms. We’ll also cover upcoming development work including new built-in algorithms and R bindings.

Abstract for part two:

In this talk, we'll examine Spark SQL, a new Alpha component that is part of the Apache Spark 1.0 release. Spark SQL lets developers natively query data stored in both existing RDDs and external sources such as Apache Hive. A key feature of Spark SQL is the ability to blur the lines between relational tables and RDDs, making it easy for developers to intermix SQL commands that query external data with complex analytics. In addition to Spark SQL, we'll explore the Catalyst optimizer framework, which allows Spark SQL to automatically rewrite query plans to execute more efficiently.
