Productionalizing ML at Scale with MLflow and H2O Sparkling Water

Details

Hello Makers!

Join us this evening to hear from Alvaro Viloria of Groupon on how they are productionalizing machine learning at scale with MLflow and H2O Sparkling Water.

Following is a brief agenda for the evening:

6:00 - 6:30 PM: Doors open for networking and pizza

6:30 - 7:15 PM: Alvaro's talk

7:15 - 7:30 PM: Q&A

Description:

The conceptual workflow of applying machine learning (ML) to a specific use case is simple: in the training phase, the learning component takes a dataset as input and builds a learned model; in the serving phase, the model takes features as input and yields predictions. The actual workflow becomes more complex when ML models need to run in a production environment, which requires careful orchestration of several components to reliably produce, deploy and evaluate such models.

At Groupon, the ranking recommendation system is based on supervised ML models. Once a model is promoted from candidate to released and starts serving real-time traffic, two questions arise: How can we ensure that the model is not losing its predictive power? How can we reliably keep track of the life cycle of every released model? To answer these questions, each new model is built on an ML pipeline that guarantees its standardization, transparency, reproducibility and reliable evaluation.

For this purpose, Groupon built a custom-made ML pipeline using a simple but powerful integration between MLflow and H2O Sparkling Water. During its training step, every model publishes all of its information (output values, hyperparameters, evaluation metrics, features, queries and so on) to MLflow, which serves as the main Model Registry. As the final step of the ML pipeline, every released model is evaluated with fresh data by applying a sequence of orchestrated steps. Each released model retrieves its metadata from MLflow and is evaluated using the same constraints over the data, so that the evaluation is reliable. Finally, the variations in the predictive power of the model are visualized in Kibana, to constantly monitor for any sign of decay.
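
To make the evaluation step described above more concrete, here is a minimal Python sketch of the idea. It is not Groupon's pipeline and does not call MLflow, H2O Sparkling Water or Kibana: the in-memory `REGISTRY` is a hypothetical stand-in for the Model Registry, and `ModelRecord`, `register_model`, `evaluate_released_model`, the `tolerance` threshold and the choice of AUC as the metric are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class ModelRecord:
    """Metadata a model publishes at training time (hypothetical schema)."""
    name: str
    hyperparameters: Dict[str, float]
    features: List[str]
    data_constraints: Dict[str, object]  # filters applied to the training data
    baseline_auc: float                  # metric recorded at training time


# Hypothetical in-memory stand-in for a model registry such as MLflow.
REGISTRY: Dict[str, ModelRecord] = {}


def register_model(record: ModelRecord) -> None:
    """Publish a model's metadata so later evaluations can retrieve it."""
    REGISTRY[record.name] = record


def auc(labels: List[int], scores: List[float]) -> float:
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


def evaluate_released_model(
    name: str,
    fresh_rows: List[Row],
    predict: Callable[[Row], float],
    tolerance: float = 0.05,  # assumed threshold for flagging decay
) -> Dict[str, object]:
    """Re-evaluate a released model on fresh data under its stored constraints."""
    record = REGISTRY[name]
    # Apply the same data constraints that were recorded at training time.
    rows = [
        r for r in fresh_rows
        if all(r.get(k) == v for k, v in record.data_constraints.items())
    ]
    labels = [int(r["label"]) for r in rows]
    # Score each row using only the features the model was trained on.
    scores = [predict({f: r[f] for f in record.features}) for r in rows]
    fresh_auc = auc(labels, scores)
    drop = record.baseline_auc - fresh_auc
    return {
        "model": name,
        "baseline_auc": record.baseline_auc,
        "fresh_auc": fresh_auc,
        "drop": drop,
        "decayed": drop > tolerance,
    }
```

In a setup like the one described, a report of this kind would be produced on a schedule for every released model and sent to a dashboard, so that a growing `drop` becomes visible over time.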

Alvaro's Bio:
Alvaro is a Software Engineer at Groupon. He can be reached on LinkedIn: https://www.linkedin.com/in/alvaro-viloria-97b4b725/

Looking forward to meeting you all!