Details

Speakers: Dr. Nina Zumel and Dr. John Mount
Title: Advanced Data Preparation for Supervised Machine Learning
Bios + Abstract: Below
Stream: https://youtu.be/sniHkkrAsOc
40% discount code for Practical Data Science with R book: mtpwhyr20 https://www.manning.com/books/practical-data-science-with-r-second-edition

Brief abstract: Dr. Nina Zumel and Dr. John Mount will present methods for advanced data preparation for supervised machine learning. In particular, we will show how to safely pre-process high-cardinality categorical variables for later use. We will spend time on the important points of cross or out-of-sample methods to reduce overfitting. We will work through theory and examples, and show how the vtreat package can be used in projects. We will also preview chapter 8 of Practical Data Science with R, 2nd Edition: Advanced Data Preparation.
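
For readers new to the idea, here is a minimal sketch of an out-of-sample encoding for a high-cardinality categorical variable. It is an illustration only and is not the vtreat API: the function name cross_fit_target_encode, its parameters, and the toy data are all assumptions made for this example. Each row's category is replaced by the mean outcome for that category, computed only from rows in other folds, so a row's own outcome never leaks into its encoded value.

```python
import random
from collections import defaultdict


def cross_fit_target_encode(categories, targets, n_folds=3, seed=0):
    """Hypothetical helper: out-of-fold mean-target encoding of one column."""
    n = len(categories)

    # Assign every row to a fold after a seeded shuffle.
    shuffled_rows = list(range(n))
    random.Random(seed).shuffle(shuffled_rows)
    fold_of = [0] * n
    for position, row in enumerate(shuffled_rows):
        fold_of[row] = position % n_folds

    encoded = [0.0] * n
    for fold in range(n_folds):
        # Per-level statistics from rows outside the current fold only.
        sums = defaultdict(float)
        counts = defaultdict(int)
        total, total_count = 0.0, 0
        for row in range(n):
            if fold_of[row] != fold:
                sums[categories[row]] += targets[row]
                counts[categories[row]] += 1
                total += targets[row]
                total_count += 1

        # Levels unseen outside the fold fall back to the out-of-fold mean.
        fallback = total / total_count if total_count else 0.0
        for row in range(n):
            if fold_of[row] == fold:
                level = categories[row]
                if counts[level]:
                    encoded[row] = sums[level] / counts[level]
                else:
                    encoded[row] = fallback
    return encoded


# Toy data (made up for illustration): a categorical column and a 0/1 outcome.
zip_codes = ["94110", "94110", "94110", "10001", "10001", "60601"]
outcome = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]
encoded = cross_fit_target_encode(zip_codes, outcome)
```

Changing any single row's outcome leaves that row's encoded value unchanged, which is the property that helps reduce overfitting when the encoded column is later fed to a model.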

Bios:

Nina Zumel is a Principal Consultant with Win-Vector, LLC, a data science consultancy in San Francisco. She has a Ph.D. in robotics from Carnegie Mellon and is one of the authors of Practical Data Science with R, a popular text on data science.

John Mount is a Principal Consultant with Win-Vector, LLC, and co-author of "Practical Data Science with R, 2nd Edition", Manning 2019. He has a Ph.D. in computer science from Carnegie Mellon.

Both John and Nina maintain a number of open source R and Python packages for data science.
