Topic: Random Forests, Theory and Practice
Speaker: Michael Zimmer
"Random Forests" is a powerful, off-the-shelf, machine learning algorithm for classification and regression. In this "journal club" style talk we'll examine the origins of the method, touching on ensemble methods, bagging and bootstrapping. Following that, we'll look at an implementation of Random Forests in R and its application to a recent Kaggle competition (Bond Pricing), as well as other examples.
Tags: random forests, decision trees, bootstrapping, bagging, aggregating, ensemble methods
A clear, brief description of using random forests in R, written by the authors of the R package. A good place to start:
A nice talk by Hemant Ishwaran on Random Forests (start at 2:10 to skip the intro).
Part 2: I suggest you watch up until 12:15
This is mentioned in the video above. It describes "bagging" (bootstrap aggregating), which is the starting point for the Random Forest algorithm:
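The core idea of bagging is simple enough to sketch in a few lines: draw bootstrap samples (same size, with replacement), fit a weak learner to each, and aggregate their predictions by majority vote. Here's a minimal, self-contained illustration in Python (the talk's examples will be in R; the toy 1-D dataset and the one-split "stump" learner below are invented purely for illustration, not taken from any of the linked resources):

```python
import random
from collections import Counter

# Toy 1-D dataset: points below 5 are mostly class 0, points above mostly
# class 1, with two noisy labels for the ensemble to smooth over.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (6, 0), (7, 1), (8, 1), (9, 1)]

def train_stump(sample):
    """Fit a one-split 'decision stump': choose the threshold t that
    minimizes misclassifications (predict class 1 when x > t)."""
    best_t, best_err = None, float("inf")
    for t in range(0, 11):
        err = sum((x > t) != (y == 1) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bagged_predict(x, stumps):
    """Aggregate: majority vote over all bootstrap-trained stumps."""
    votes = Counter(int(x > t) for t in stumps)
    return votes.most_common(1)[0][0]

random.seed(0)
# Bootstrap: draw 25 resamples of the data, with replacement,
# and train one stump per resample.
stumps = [train_stump(random.choices(data, k=len(data))) for _ in range(25)]

print(bagged_predict(2, stumps))
print(bagged_predict(9, stumps))
```

Each individual stump is a high-variance learner (its threshold jumps around from resample to resample); averaging many of them over bootstrap samples reduces that variance, which is exactly the effect Random Forests later builds on by bagging full decision trees with extra per-split feature randomization.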
In the talk, we'll look at the Kaggle competition on bond pricing. In the Data section, be sure to look at the R code for the random forest benchmark (called "random_forest_benchmark.r").
A description of the randomForest package in R, which comes in handy
If you'd like more extensive information, here's the website of Leo Breiman, the author of the Random Forest algorithm:
Michael Zimmer, PhD is a programmer/consultant with a background in science and an interest in machine learning.