Details

Description

So you're a data scientist wrangling data that keeps avalanching in, and errors are always cropping up: NaNs, strings where there are supposed to be integers, and more. Moreover, your team is writing code that gets reused, but that code is failing in mysterious places. How do you solve this? Testing is the answer! In this tutorial, you will gain practical, hands-on experience writing tests in a data science setting so that you can continually ensure the integrity of your code and data. You will learn how to use py.test, coverage.py, and hypothesis to write better tests for your code.
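To give a flavor of what a data test can look like, here is a minimal sketch, not taken from the tutorial materials. The record layout, the field names (`sample_id`, `titer`), and the helper `find_bad_rows` are all hypothetical. It checks for the two problems named above: NaNs and strings where integers are expected.

```python
import math

# Hypothetical records; the field names are illustrative assumptions,
# not taken from the tutorial repository.
RECORDS = [
    {"sample_id": 1, "titer": 12.5},
    {"sample_id": 2, "titer": 7.0},
]


def find_bad_rows(records):
    """Return the indices of rows with a non-integer id or a non-float/NaN titer."""
    bad = []
    for i, row in enumerate(records):
        sample_id = row["sample_id"]
        titer = row["titer"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        id_ok = isinstance(sample_id, int) and not isinstance(sample_id, bool)
        titer_ok = isinstance(titer, float) and not math.isnan(titer)
        if not (id_ok and titer_ok):
            bad.append(i)
    return bad


def test_records_are_clean():
    # The known-good data should produce no complaints.
    assert find_bad_rows(RECORDS) == []


def test_checker_flags_bad_rows():
    # A string id and a NaN value should both be caught.
    dirty = RECORDS + [
        {"sample_id": "3", "titer": 4.2},
        {"sample_id": 4, "titer": float("nan")},
    ]
    assert find_bad_rows(dirty) == [2, 3]
```

Functions named `test_*` that use plain `assert` statements are the form a test runner such as py.test collects and runs, so a file like this could be checked every time new data arrives.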

Instructor Bio

Eric Ma is a 6th-year PhD candidate in the Runstadler Lab in the Biological Engineering department at MIT. He studies the influenza virus, which is like a self-replicating deck of 8 poker cards, and uses Python to solve infectious disease data science problems.

Pre-Tutorial Instructions

Please follow instructions on the GitHub repository: https://github.com/ericmjl/data-testing-tutorial

Other Notes

Food will not be provided, as we do not have sponsors for the event. Lunch options nearby in the Kendall/MIT area include Au Bon Pain, Chipotle, Clover, Champions, and more.

Sponsors

Matterbeam

Sponsor of the Jan 21 presentation night

Temporal

Temporal sponsors our May 8th PyCon presentation rehearsals

Cambridge Mobile Telematics

CMT has sponsored Presentation Night

DataDog

DataDog is a regular host and sponsor of our in-person events
