
PyData Montreal #15: online event

Hosted by Maria Khalusova

Details

Agenda:
(All times in EST)
6:00 pm — Introductions
6:10 pm — "Continuous integration for Machine Learning" by Elle O'Brien
6:50 pm — Q&A

7:05 pm — 5 min break

7:10 pm — "Testing production Machine Learning systems" by Josh Tobin
7:50 pm — Q&A
8:00 pm — Wrap-up

"Continuous integration for Machine Learning"

Abstract:
Machine learning is maturing as a discipline: now that it's trivially easy to create and train models, the harder problem is managing the complexity of experiments, changing datasets, and the demands of a full-stack project. In this talk, we'll examine why continuous integration, one of the staples of DevOps, has been so challenging to implement in ML projects so far, and how it can be done using open-source tools like Git, GitHub Actions, and DVC (Data Version Control). We'll also discuss a new open-source project we've created, Continuous Machine Learning, which adapts popular continuous integration systems like GitHub Actions and GitLab CI to data science projects.

About Elle O'Brien:
Elle is a data scientist at Iterative, a startup building open-source software tools for machine learning, and a lecturer at the University of Michigan School of Information. She completed her PhD at the University of Washington, where she conducted research on speech and hearing using mathematical models. Elle is broadly interested in developing methods, standards, and educational resources for anyone who works with data.

"Testing production Machine Learning systems"

Abstract:
Testing is a critical part of the software development cycle. As your software project grows, dealing with bugs and regressions can consume your team if you do not take a principled approach to testing. For this reason, software testing methodologies are well studied. However, machine learning models introduce a new set of complexities beyond traditional software. In particular, machine learning models depend on data in addition to code. As a result, testing methodologies for machine learning systems are less well understood and less widely implemented in practice. In this talk, we argue for the importance of testing in ML, give an overview of the types of testing available to ML practitioners, and make recommendations about how you can start to incorporate more robust testing into your ML projects.

About Josh Tobin:
Josh Tobin is the founder and CEO of a stealth machine learning startup. Previously, Josh worked as a deep learning and robotics researcher at OpenAI and as a management consultant at McKinsey. He is also the creator of Full Stack Deep Learning (fullstackdeeplearning.com), the first course focused on the emerging engineering discipline of production machine learning. Josh did his PhD in Computer Science at UC Berkeley, advised by Pieter Abbeel.
