Tutorial: Best Testing Practices for Data Science

Hosted By
Eric M.
Details

Description

So you're a data scientist wrangling data that's continually avalanching in, and errors keep cropping up: NaNs, strings where there are supposed to be integers, and more. Moreover, your team is writing code that gets reused, but that code is failing in mysterious places. How do you solve this? Testing is the answer! In this tutorial, you will gain practical, hands-on experience writing tests in a data science setting so that you can continually ensure the integrity of your code and data. You will learn how to use py.test, coverage.py, and hypothesis to write better tests for your code.
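
As a rough illustration of the kind of data check described above, here is a minimal sketch rather than material from the tutorial itself. The helper `find_bad_records`, the `count` column, and the sample records are all hypothetical. The test functions follow the py.test convention of `test_`-prefixed names and plain `assert` statements.

```python
def find_bad_records(records, column):
    """Return the indices of rows whose `column` value is not a clean integer.

    Hypothetical helper for illustration: NaNs (floats), strings, and
    missing values are all flagged.
    """
    bad = []
    for i, row in enumerate(records):
        value = row.get(column)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            bad.append(i)
    return bad


def test_clean_data_passes():
    records = [{"count": 3}, {"count": 7}]
    assert find_bad_records(records, "count") == []


def test_nan_and_string_are_flagged():
    records = [{"count": 3}, {"count": float("nan")}, {"count": "7"}]
    assert find_bad_records(records, "count") == [1, 2]
```

Saved in a file such as `test_data.py` (the filename is an assumption), these tests would be collected and run by invoking `py.test` in the same directory.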

Instructor Bio

Eric Ma is a sixth-year PhD candidate in the Runstadler Lab in the Biological Engineering department at MIT. He studies the influenza virus, which is like a self-replicating deck of 8 poker cards, and uses Python to solve infectious disease data science problems.

Pre-Tutorial Instructions

Please follow instructions on the GitHub repository: https://github.com/ericmjl/data-testing-tutorial

Other Notes

Food will not be provided, as we do not have sponsors for the event. Lunch options nearby in the Kendall/MIT area include Au Bon Pain, Chipotle, Clover, Champions, and more.

The Boston Python User Group