PyLadies Melbourne is a Python programming group that welcomes women and genderqueer/non-binary individuals across Melbourne, Australia. PyLadies aims to provide a friendly peer network for women and GQ/NB people to share their interest in Python programming.
Data Cleaning... But Why? - An introduction to Data Cleaning & Pre-processing for Machine Learning
Messy data is not fun. It's not fun to deal with, it's not fun to clean, but most of all, its negative effects on your ML projects are definitely not fun. In this talk, watch me try to make a really dry topic fun and interesting, and give you some ideas on minimizing the amount of time you spend on the "not fun" stuff, so you'll have more time to spend on the best parts of working with data!
This talk will cover some basic topics that affect everyone working with data and should be helpful for ML newbies and veterans alike.
I'll be going through some tips on how to clean and prepare your data for Machine Learning, and explaining some of the common terms you'll hear along the way.
I'll be covering things like feature selection, handling different kinds of data discrepancies, explaining big ol' words like multicollinearity, and more!
This talk was first given at Melbourne Women in Machine Learning & Data Science, presented now with bonus Python!
Brooke Clarke is a Data Analyst at EA Firemonkeys Studio in Melbourne, and is excited about all things data! You can read more about her Data Science journey on her blog, girlvsdata.com, or follow her on Twitter @girlvsdata.
Note: PyLadies monthly meetings are for women and genderqueer/non-binary individuals. Thank you for abiding by the group's intent.