Serving a large number of ML models at low latency
Details
Serving machine learning models is a scalability challenge at many companies. Most applications require only a small number of models (often fewer than 100) to serve predictions. Cloud platforms that support model serving, by contrast, can host hundreds of thousands of models, but they provision separate hardware for different customers. Salesforce faces a challenge that very few companies deal with: it needs to run hundreds of thousands of models on shared infrastructure across multiple tenants for cost-effectiveness.
In this talk we will explain how Salesforce hosts hundreds of thousands of models on a multi-tenant infrastructure to support low-latency predictions.
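One common building block for serving far more models than fit in memory is an LRU cache of loaded models, so hot models stay resident while cold ones are loaded on demand. The sketch below is a minimal illustration of that idea only; all names are hypothetical and it does not represent Salesforce's actual system.

```python
from collections import OrderedDict

class ModelCache:
    """Hypothetical LRU cache for loaded models: keeps at most
    `capacity` models in memory, evicting the least recently used,
    so many tenants' models can share one serving host."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader          # callable: model_id -> model object
        self._cache = OrderedDict()   # model_id -> model, ordered by recency

    def get(self, model_id):
        if model_id in self._cache:
            self._cache.move_to_end(model_id)   # mark as most recently used
            return self._cache[model_id]
        model = self.loader(model_id)           # cache miss: load from storage
        self._cache[model_id] = model
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)     # evict least recently used
        return model

# Usage: three tenants' models, with room for only two in memory.
cache = ModelCache(capacity=2, loader=lambda mid: f"model-{mid}")
cache.get("tenant-a/churn")
cache.get("tenant-b/churn")
cache.get("tenant-a/churn")   # refreshes tenant-a's model
cache.get("tenant-c/churn")   # evicts tenant-b/churn (least recently used)
```

A production system would also need per-tenant isolation, memory-based (rather than count-based) eviction, and concurrency control, which are presumably among the topics the talk covers.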
Agenda:
11:40 am - 11:50 am Arrival and socializing
11:50 am - 12:00 pm Opening
12:00 pm - 1:50 pm Manoj Agarwal, "Serving a large number of ML models at low latency"
1:50 pm - 2:00 pm Q&A
About: Manoj Agarwal
Manoj Agarwal is a Software Architect in the Einstein Platform team at Salesforce. He has almost 25 years of experience in the industry, building distributed systems, public cloud services and machine learning platforms.
Please register using the Zoom link below to receive a reminder:
https://us02web.zoom.us/webinar/register/WN_pfMTStyVQ3mrBVhzKihKBg
Webinar ID: 811 6053 3641
