Upcoming Apache Spark 2.4 and MLflow: new tools for Big Data and ML


Details
Agile Lab and Databricks will hold a meetup on the upcoming Apache Spark 2.4 and MLflow, with special guest Tim Hunter from Databricks.
Free beers at the end of the meetup @Birrificio Lambrate Golgi!!
Tim Hunter from Databricks will give us an overview of the latest developments in Spark and of a recent Databricks project for simplifying machine learning.
Short Bio of Tim:
Tim Hunter is a software engineer at Databricks and contributes to the Apache Spark MLlib project, as well as the GraphFrames, TensorFrames and Deep Learning Pipelines libraries. He has been building distributed Machine Learning systems with Spark since version 0.0.2, before Spark was an Apache Software Foundation project.
Abstract:
- Features in Upcoming Apache Spark 2.4: What’s New & Why Should You Care
This talk will provide an overview of the major features and enhancements in this upcoming release and will be followed by a Q&A session. The soon-to-be-released Apache Spark 2.4 comes packed with new functionality: a new scheduling model, a native Avro data source, PySpark's eager evaluation mode, Kubernetes support, and many other improvements.
- MLflow: a new open source project from Databricks that simplifies building and deploying machine learning models
Successfully building and deploying a machine learning model can be difficult to do once. Enabling other data scientists (or yourself, one month later) to reproduce your pipeline, to compare the results of different versions, to track what's running where, and to redeploy and roll back updated models is much harder.
MLflow provides APIs for tracking experiment runs across multiple users within a reproducible environment, and for managing the deployment of models to production. Moreover, MLflow is designed to be an open, modular platform: you can use it with any existing ML library and incorporate it incrementally into an existing ML development process.
