
Reproducible Machine Learning

Hosted by Pramit Choudhary

Details

Hi Everyone,
Agenda for the next virtual event:
11:30 am - 11:45 am: Introductions/Meet and greet
11:45 am - 12:45 pm: Reproducible ML experiments (with Git & DVC) by Milecia
12:50 pm - 1:30 pm: AIQC for rapid, rigorous, & reproducible deep learning by Layne
1:30 pm - 1:45 pm: Final thoughts

First talk by Milecia McGregor
Bio:
Milecia is a senior software engineer, international tech speaker, and mad scientist who works with hardware and software. She will try to make anything with JavaScript first. In her free time, she enjoys learning random things, like how to ride a unicycle, and playing with her dog.

Discussion: "Reproducible ML experiments (with Git & DVC)"
In this workshop, you will learn how to use the open-source tool DVC to compare training metrics across two hyperparameter tuning methods: grid search and random search. You'll learn how to save and track changes to your data, code, and metrics without adding a lot of commits to your Git history. This approach scales with your data and projects and helps your team reproduce results easily.
https://github.com/iterative/dvc
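
For anyone new to the two tuning methods named above, here is a minimal sketch of how they differ. It is not the workshop's code and does not use DVC: the search space, the `toy_score` function, and all helper names are hypothetical stand-ins for a real training run, chosen only to show that grid search tries every combination while random search samples a fixed number of them.

```python
import itertools
import random

# Hypothetical search space; the names and values are illustrative only.
SEARCH_SPACE = {
    "learning_rate": [0.001, 0.01, 0.1],
    "max_depth": [2, 4, 8],
}


def toy_score(params):
    """Stand-in for a real training run: returns a made-up validation score.

    By construction it peaks at learning_rate=0.01 and max_depth=4.
    """
    lr_penalty = abs(params["learning_rate"] - 0.01)
    depth_penalty = 0.05 * abs(params["max_depth"] - 4)
    return 1.0 - lr_penalty - depth_penalty


def grid_search(space):
    """Return every combination of values in the search space."""
    keys = list(space)
    value_lists = [space[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in itertools.product(*value_lists)]


def random_search(space, n_trials, seed=0):
    """Return n_trials independently sampled combinations.

    A fixed seed makes the sampled candidates repeatable between runs.
    """
    rng = random.Random(seed)
    return [
        {key: rng.choice(values) for key, values in space.items()}
        for _ in range(n_trials)
    ]


def best(candidates):
    """Pick the candidate with the highest toy score."""
    return max(candidates, key=toy_score)


grid_candidates = grid_search(SEARCH_SPACE)
random_candidates = random_search(SEARCH_SPACE, n_trials=4)

grid_best = best(grid_candidates)
random_best = best(random_candidates)

print("grid search tried", len(grid_candidates), "combinations; best:", grid_best)
print("random search tried", len(random_candidates), "combinations; best:", random_best)
```

Grid search is exhaustive, so its cost grows with the product of the value lists, while random search keeps the cost fixed at `n_trials` but may miss the best combination. Tracking the parameters and the resulting metrics of each run is the part the workshop addresses with DVC.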

Second talk by Layne Sadler
Bio:
An autodidact at heart, Layne began on the business side of technology, but curiosity led him to build his own apps and algorithms. While working as a product management lead with pharma and research institutes on national genomic biobank projects, he observed barriers that prevented the adoption of deep learning in scientific research, so he built AIQC to address those problems. He also helps out with product research at Project Jupyter.
https://twitter.com/LayneSadler
https://www.linkedin.com/in/laynesadler/

Discussion: "AIQC for rapid, rigorous, & reproducible deep learning"
In this 30-minute talk, we’ll provide an overview and demo of the AIQC framework: https://github.com/aiqc/AIQC

  1. The need for deep learning in scientific research.
    AIQC framework components for best-practice preprocessing, experiment tracking & evaluation, and total reproducibility.

  2. Using the AIQC high-level API to rapidly train and evaluate models for (a) regression of Kepler satellite data to predict temperatures of exoplanets, and (b) binary classification of MRI scans to detect brain tumors.

  3. Chronic preprocessing challenges: data leakage, evaluation bias, partial reproducibility.
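
As a rough illustration of the data leakage item above, here is a minimal sketch in plain Python. It does not use AIQC or reflect its API; the `split`, `fit_scaler`, and `transform` helpers and the sample data are hypothetical. The idea it shows is that scaling statistics are computed from the training split only and then reused on the test split, so no information from the held-out data influences preprocessing. The fixed seed makes the split itself repeatable.

```python
import random
import statistics


def split(rows, test_fraction=0.25, seed=0):
    """Shuffle with a fixed seed and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Compute the mean and population standard deviation of the given values."""
    return statistics.mean(values), statistics.pstdev(values)


def transform(values, scaler):
    """Standardize values using previously fitted statistics."""
    mean, std = scaler
    return [(value - mean) / std for value in values]


# Hypothetical one-feature dataset.
data = [float(x) for x in range(1, 21)]

train, test = split(data)

# Fit on the training split only; fitting on `data` would leak test information.
scaler = fit_scaler(train)

train_scaled = transform(train, scaler)
test_scaled = transform(test, scaler)

print("train size:", len(train), "test size:", len(test))
print("scaler (mean, std) fitted on train only:", scaler)
```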

Note: The date and time of the discussion might change.

Regards,
Pramit

PyData SoCal