Meetup #4 - Productionizing Machine Learning with Delta Lake, Koalas, and MLflow

Name: Meetup #4 - Productionizing Machine Learning with Delta Lake, Koalas, and MLflow
Start: 2019-10-03T18:00:00-04:00
End: 2019-10-03T20:00:00-04:00
Location: Kinaxis

Hosted by Marshall B.

Toronto Apache Spark TAS 2.0

Details

Daniel Arrizza: www.linkedin.com/in/danielarrizza

Daniel is a Customer Success Engineer at Databricks.

For many data scientists, the process of building and tuning machine learning models is only a small portion of the work they do every day. The vast majority of their time is spent doing the less-than-glamorous (but crucial) work of performing ETL, building data pipelines, and putting models into production.

In this session, we’ll walk through the process of building a production data science pipeline step by step. Using open-source tools, we will:

Query a data lake with Apache Spark™ and Delta Lake,
Transform the data with Koalas (the pandas API on Apache Spark),
Run machine learning experiments with hyperparameter tuning (Hyperopt), and
Log our experiment results to MLflow.

____________

Schedule:
6:00pm - Check-in, Socialize & Eat Pizza
6:30pm - Productionizing Machine Learning with Delta Lake, Koalas, and MLflow
7:30pm - Q&A
7:55pm - Meetup Conclusion
____________
