From research-grade deep learning models to practical production systems


Details
Moving research-grade deep learning models into production systems presents considerable technical challenges, even more so when production needs to support over a billion ML predictions per second. Learn the tips and tricks used at Outbrain!
So once again, mark your calendars, free your GPUs, and join us for a discussion of deep learning, with the same excitement as an optimizer guiding a new network toward a local minimum!
---
Samo Pahor
Bridging the Gap: From Offline Research to Production
In this talk, we will go over the steps of our model productization process: selecting a model architecture, preparing the offline dataset, exploring and engineering model features, and running extensive hyperparameter optimization using our in-house AutoML framework. Furthermore, we will discuss how we perform both offline and online evaluations and how we typically set up and scale our A/B tests to fairly measure final model performance.
The Transition from Local Development to Production
In this talk, we will describe how we bring our predictive models to a production-ready state. The steps include selecting a model architecture, preparing the training data, selecting useful predictive features, designing new features, and an extensive search of the model's hyperparameters with a purpose-built AutoML tool. We will also cover model evaluation both locally and in the production environment, and touch on how we run A/B tests at global scale.
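As a rough illustration of the hyperparameter-optimization and offline-evaluation steps mentioned in the abstract above, here is a minimal sketch in Python. It uses a generic random search over a logistic-regression baseline with scikit-learn; the synthetic dataset, parameters, and search budget are illustrative placeholders, not Outbrain's in-house AutoML framework.

# Minimal sketch: random hyperparameter search with offline log-loss evaluation.
# All names and numbers below are hypothetical stand-ins for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for an offline conversion-prediction dataset (rare positives).
X, y = make_classification(n_samples=20_000, n_features=50, weights=[0.97], random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

best = None
for _ in range(20):  # small random-search budget
    params = {"C": float(10 ** rng.uniform(-3, 1)), "max_iter": 200}
    model = LogisticRegression(**params).fit(X_train, y_train)
    score = log_loss(y_val, model.predict_proba(X_val)[:, 1])  # offline metric
    if best is None or score < best[0]:
        best = (score, params)

print(f"best offline log loss: {best[0]:.4f} with params {best[1]}")

The same loop structure generalizes: propose a candidate configuration, train it, score it on a held-out offline metric, and promote only the best candidates to an online A/B test.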
Samo is a data scientist and machine learning engineer at Outbrain, working on conversion prediction, with a focus on training state-of-the-art predictive models and improving existing machine learning pipelines.
---
Blaž Škrlj
Advancing Large-Scale Inference through Model Quantization
This talk takes a detailed look at the methods we use to refine the training, weight propagation, and inference phases of machine learning models in a production environment. We devote particular attention to model quantization: deliberately reducing weight precision at strategic points in the workflow to significantly improve performance. We present an in-house quantization algorithm that considerably reduces the bandwidth required to transport models between training environments and geographically dispersed serving data centers. We will give an overview of the methods used and share the considerable impact they have had on operational efficiency and model deployment.
Quantization of Machine Learning Models and Efficient Serving
In this talk, we will address several techniques related to training machine learning models, transporting their weights, and serving. One of the key points of the talk will be model quantization. Such approaches allow a targeted reduction of numerical precision in specific parts of the system, for example to speed up weight transport. We will present one of our quantization algorithms, how we scaled it across multiple data centers around the world, and what effect quantization has had in production.
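To make the idea concrete, here is a minimal sketch of generic symmetric int8 weight quantization in Python (assuming numpy). It is not the in-house algorithm described in the talk; it only illustrates why lower-precision weights cut the bandwidth needed to ship models to serving data centers.

# Minimal sketch: symmetric per-tensor int8 quantization of float32 weights.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 plus a single per-tensor scale factor."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights on the serving side."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(1024, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(f"bytes before: {w.nbytes}, after: {q.nbytes} (~4x smaller payload)")
print(f"max abs reconstruction error: {np.max(np.abs(w - w_hat)):.5f}")

The int8 payload is roughly a quarter the size of the float32 one, at the cost of a small, bounded reconstruction error; the approach discussed in the talk targets the same trade-off at production scale.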
Blaž is a machine learning engineer on Outbrain's Research Infra team, focusing on the intersection of infra-level optimizations and machine learning research.
---
The meetup is sponsored by Outbrain, which is also providing the event space, complimentary drinks, and pizza at their offices in Ljubljana.
Who? Neural Net Enthusiasts
When? Thu, Mar 14 @ 18h
Where? Dunajska c. 5, 1000 Ljubljana
