Apache Spark is a popular big data processing engine and ecosystem with a strong and growing community. This group is for people in the greater Denver area who are interested in learning more about Spark's capabilities, use cases, and integration with other big data tools, as well as in engaging with the larger Spark community.
In this workshop, we’ll cover best practices for using powerful open source technologies to simplify and scale your enterprise's ML efforts. We’ll discuss how to leverage Apache Spark™, the de facto data processing and analytics engine in enterprises today, for data preparation, since it unifies data at massive scale across various sources. You’ll learn how to use ML frameworks (e.g., TensorFlow, XGBoost, scikit-learn) to train models based on different requirements. Finally, you’ll learn how to use MLflow to track experiment runs across multiple users within a reproducible environment and to manage the deployment of models to production.
Join this half-day workshop on March 14th in Denver to learn how unified analytics can bring data science and engineering together to accelerate your ML efforts. This free workshop will give you the opportunity to:
Learn how to build highly scalable and reliable pipelines for analytics
Gain deeper insight into Apache Spark and Azure Databricks, including the latest updates to Databricks Delta
Train a model against data and learn best practices for working with ML frameworks (e.g., XGBoost, scikit-learn)
Learn how to use MLflow to track experiments, share projects, and deploy models in the cloud and on-premises
Network and learn from your ML and Apache Spark peers
AGENDA AT A GLANCE
8:30-9:00 Registration, Breakfast & Networking
9:00-9:45 Opening Remarks: Unifying Data Science and Data Engineering
9:45-10:15 Customer Use Case
10:15-10:45 Networking with Peers
10:45-11:30 Data Engineering Interactive Demo & Best Practices: Preparing Data for Analytics
11:30-12:15 Data Science Interactive Demo & Best Practices: Model Training and Machine Learning