Ensemble methods are generally agreed to be the closest thing to a cure-all in machine learning. Google uses ensemble methods to determine what ads to place on pages, and eBay uses them to determine what search results to return. Ensemble methods combine a large number of mediocre machine learning models into a single answer. When done properly, this results in an algorithm that resists overfitting and delivers top performance on a wide variety of problems; ensembles have figured in the winning entries of many Kaggle competitions, for example.
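The core idea above (many mediocre models, one strong combined answer) can be sketched with a simple majority-vote simulation. This is a minimal illustration in Python (the class itself uses R); it assumes hypothetical independent classifiers, each correct with probability 0.7, and shows that voting among 25 of them is far more accurate than any one alone.

```python
import random

random.seed(0)

def majority_vote_accuracy(n_models, p_correct, n_trials=10000):
    """Estimate the accuracy of a majority vote over n_models
    independent classifiers, each correct with probability p_correct."""
    wins = 0
    for _ in range(n_trials):
        # Count how many of the individual classifiers got this example right.
        n_right = sum(random.random() < p_correct for _ in range(n_models))
        # The ensemble answers correctly when more than half of them are right.
        if n_right > n_models / 2:
            wins += 1
    return wins / n_trials

single = majority_vote_accuracy(1, 0.7)    # one mediocre model: ~70% accurate
ensemble = majority_vote_accuracy(25, 0.7) # vote of 25: well above 95%
```

The independence assumption is the catch: real models trained on the same data are correlated, and much of the machinery covered in this class (bagging, random feature selection, boosting) exists to reduce that correlation.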
This class will start with the basics of decision trees and will cover the background, usage, strengths, and weaknesses of the major ensemble algorithms. By the end of the class, attendees will understand when these algorithms are applicable and how to get the best performance from them. The class will meet for 4 hours on each of 4 Saturdays. Here are the topics we'll cover.
1. Decision Trees, Boosting
2. Gradient Boosting
3. Random Forests
4. Topics of Interest in Machine Learning (e.g. active learning, low-rank matrix approximation for recommender systems, the expectation-maximization algorithm)
If attendees are interested, we can schedule an additional session on projects or a machine learning competition.
The class is intended for computer programmers. No prior knowledge of machine learning is assumed. The course will primarily use the R statistical language, and there will be a separate review of R for those who need it. The class will include derivations that require undergraduate-level math: calculus and linear algebra.
Sept 14, Sept 21, Sept 28, Oct 5 - One payment covers all 4 sessions.
Early Bird Registration
There's a $100 discount if you register and pay at least 5 days before class starts. You can pay through Eventbrite at datascience202.eventbrite.com or through PayPal (mike at mbowles dot com).
Attend by web conference
Sessions will be webcast. If you'd like to attend by webcast, be sure to sign up at least 12 hours before the start of class in order to receive instructions and passwords.