Life after Deployment: Maintaining Models in Production (w/ NVIDIA, Bloomberg)


6:30pm: Pizza + Beer networking
7:00pm: 10-minute talks from NVIDIA, Bloomberg, and Dataiku
7:30pm: Open Q&A

Putting models into production is often seen as the completion of a machine learning project - but what happens post-deployment? This meetup will focus on this underappreciated (and frequently unpredicted) side of machine learning, addressing how models evolve and tackling the organizational and engineering challenges of maintenance, such as managing technical debt and compounding complexity.

In a series of 10-minute talks with NVIDIA, Bloomberg, and Dataiku, we will discuss different industry approaches to model maintenance. These talks will be followed by a Q&A panel with all of the speakers - come ready with a question or two!

Challenges and learnings to bring research into production by Nicolas Koumchatzky, NVIDIA:
As a manager of a large team working on ML problems, one issue I encountered was the difficulty of translating valuable research work into production systems. Researchers need flexibility and fast iteration, while production systems need safety, scale, and robustness. In this talk, we will go over a few of the experiences I had over the years, and what I tried to do to improve the situation, with varying degrees of success.

System Design v. Infrastructure by Michael Burkholder, Bloomberg:
When developing machine learning models for production, there is no one-size-fits-all recipe for system design. It's important to consider the engineering and product objectives early in the model development lifecycle, and to spend effort developing supporting infrastructure. In this talk, I will give an anecdotal case study illustrating the modeling tradeoffs faced while developing an ML platform to compute the prices of financial instruments, and the infrastructure requirements to support the platform.

Exploring and Preventing Technical Debt by Patrick Masi-Phelps, Dataiku:
Patrick will discuss technical debt in production machine learning projects, a concept refined in 2015 by Google researchers (Sculley et al.), building on the software engineering idea introduced in the 1990s. Taking the extra time to simplify pipelines and account for changes in model inputs and configuration parameters can save time and mitigate the risks of running models in production. He'll cover the theory, then present a couple of examples from clients in healthcare and aviation.

Nicolas Koumchatzky started as a quant, using models to evaluate the price of complex financial derivatives. He quickly joined a startup called DerivExperts in Paris to deliver that service to third-party buyers. After spending 5 years there as a manager, he embarked on another startup adventure at Madbits, focused on deep learning for image+text search, which was promptly acquired by Twitter. There, he developed deep learning models for image and spam filtering, moving on to create the first iteration of Twitter's first deep learning platform, DeepBird. He then became a manager for the Twitter Cortex team, developing the ML platform with automation, better recommender systems, and an improved version of the deep learning platform. A year ago, he joined NVIDIA as a Director of AI Infrastructure to build an ML platform for developing self-driving cars.

Michael Burkholder received his PhD in Mechanical Engineering from Carnegie Mellon University, studying nonlinear, chaotic, and stochastic electrochemical systems. He leads an ML team at Bloomberg LP developing high-performance models and infrastructure to power Bloomberg's risk analysis engine. Michael enjoys roasting his own coffee and listening to vinyl.

Patrick Masi-Phelps is a Data Scientist at Dataiku, where he helps clients build and deploy predictive models. Before joining Dataiku, he studied math and economics at Wesleyan University and was a fellow at NYC Data Science Academy. Patrick is always keeping up with the latest ML techniques in astronomical and public policy research.