Automated pipeline solution to scaling models for ML hosting


Details
Shreeshankar Chatterjee Presents:
Intuit is investing in building robust Machine Learning (ML) platforms that solve the problems of training, deploying, and hosting ML models at scale, often leveraging public cloud service offerings like AWS SageMaker. The data science community works on the notion of ‘Bring Your Own Model’ to the ML platform, where every model version has its own compute and memory footprint for training and hosting infrastructure, with critical cost ramifications for operating 24x7 at enterprise scale. This talk highlights patterns for an automated pipeline solution that generates both vertical and horizontal scaling models for ML hosting, optimizing not only for performance but also for cloud operating costs. We will discuss greedy selection algorithms that automate the design and management of performance experiments, along with the related simulation and scoring systems. The solution helps predict capacity and cloud operating cost models as part of a reliable, self-service workflow for onboarding ML models onto the ML platform. The patterns and techniques discussed apply in general to hands-off automated infrastructure modeling and cost prediction problems in continuous delivery pipelines for cloud-native systems.
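
As a rough illustration of the kind of greedy selection the abstract mentions, the sketch below picks a hosting configuration from a set of candidates. It is not the speaker's algorithm: the Candidate fields, the plan_capacity function, the instance labels, and all prices and throughput figures are hypothetical, and the throughput numbers are assumed to come from prior performance experiments run against a latency target. The greedy step chooses the instance size with the lowest cost per unit of throughput (the vertical choice), then computes how many instances are needed to meet the target load (the horizontal choice).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str           # hypothetical instance-size label
    hourly_cost: float  # assumed price per instance-hour
    max_rps: float      # assumed requests/sec one instance sustains within the latency target


def plan_capacity(candidates, target_rps):
    """Greedily pick the size with the lowest cost per request/sec, then scale it out.

    Returns (instance name, instance count, total hourly cost).
    """
    if target_rps <= 0:
        raise ValueError("target_rps must be positive")
    usable = [c for c in candidates if c.max_rps > 0]
    if not usable:
        raise ValueError("no candidate can serve traffic")

    # Vertical choice: best price/performance ratio among the measured sizes.
    best = min(usable, key=lambda c: c.hourly_cost / c.max_rps)
    # Horizontal choice: smallest instance count that covers the target load.
    count = math.ceil(target_rps / best.max_rps)
    return best.name, count, count * best.hourly_cost


# Hypothetical measurements for three instance sizes.
measured = [
    Candidate("small", 0.125, 40),
    Candidate("medium", 0.25, 100),
    Candidate("large", 0.5, 180),
]
name, count, hourly = plan_capacity(measured, target_rps=250)
print(name, count, hourly)
```

Because the instance count is rounded up, the best price/performance ratio does not always give the lowest total cost; a fuller version would likely score every candidate's total cost and also account for headroom and redundancy.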
