Cleaning Data for Effective Data Science: Explorations in Anomaly Detection


~~~~~~ Cleaning Data for Effective Data Science ~~~~~~
~ ~ ~ ~ ~ ~ Explorations in Anomaly Detection ~ ~ ~ ~ ~ ~
David Mertz will be talking to us about cleaning data - the topic of his latest book 'Cleaning Data for Effective Data Science' (https://gnosis.cx/cleaning/).
He'll start with a general roadmap of data cleaning (what we do in that first 80% of our data jobs!) and then delve into more detail around anomaly detection.
============
David Mertz
David was a Director of the PSF for six years, and remains co-chair of its Trademarks Committee, its Python-Cuba working group, and of the Scientific Python Working Group.
He wrote the columns 'Charming Python' and 'XML Matters' for IBM developerWorks, short books for O’Reilly, and the Addison-Wesley book 'Text Processing in Python'. He has spoken at multiple OSCon and PyCon conferences and at AnacondaCon, and was an invited keynote speaker at PyCon-India, PyCon-UK, PyCon-ZA, PyCon Belarus, PyCon Cuba, PyData SF, and PiterPy (Russia).
David created the data science training program for Anaconda Inc. and was a senior trainer for them. Before that, he worked for 8 years with the folks (D. E. Shaw Research) who have built the world’s fastest, highly specialized (down to the ASICs and network layer) supercomputer for performing molecular dynamics. Nowadays he is mostly a data scientist, and teaches our robot overlords to be perceptive.
He is pleased to find Python has become the default high-level language for most scientific computing projects.
