This event has passed
Speakers: Dr. Nina Zumel and Dr. John Mount
Title: Advanced Data Preparation for Supervised Machine Learning
Bios + Abstract: Below
40% discount code for Practical Data Science with R book: mtpwhyr20 https://www.manning.com/books/practical-data-science-with-r-second-edition
- donate: http://whyr.pl/donate/
- channel: youtube.com/c/WhyRFoundation
- date: every Thursday 8:00 pm GMT+2 (starting April 2nd!)
- format: one 45-minute stream + 10 minutes for Q&A
- comments: ask questions on YouTube live chat
- join the Why R? Slack: whyr.pl/slack/
Brief abstract: Dr. Nina Zumel and Dr. John Mount will present methods for advanced data preparation for supervised machine learning. In particular, we will show how to safely pre-process high-cardinality categorical variables for later use. We will spend time on the important points of cross or out-of-sample methods to reduce overfitting. We will work through theory and examples, and show how the vtreat package can be used in projects. We will also preview chapter 8 of Practical Data Science with R, 2nd Edition: Advanced Data Preparation.
Nina Zumel is a Principal Consultant with Win-Vector, LLC, a data science consultancy in San Francisco. She has a Ph.D. in robotics from Carnegie Mellon and is one of the authors of Practical Data Science with R, a popular text on data science.
John Mount is a Principal Consultant with Win-Vector, LLC, and co-author of "Practical Data Science with R, 2nd Edition", Manning 2019. He has a Ph.D. in computer science from Carnegie Mellon.
Both John and Nina maintain a number of open source R and Python packages for data science.