Jeremy Howard founded FastMail.FM (sold to Opera Software in 2010) and Optimal Decisions Group (sold to ChoicePoint/LexisNexis in 2008). After selling FastMail, he became interested in data mining competitions and entered several at Kaggle, where he has had a number of good results, including:
- Tourism time-series forecasting (team with Lee Baker): Winner
- University grant prediction: Winner
- Chess ratings: 2nd place
- IJCNN social network challenge: 4th place
Jeremy liked Kaggle so much that he joined the company! He is now Kaggle's Chief Data Scientist. (Kaggle is the company running the $3 million Heritage Health Prize.)
In this talk, Jeremy will provide tips on how to compete successfully and show how he combines R with other tools to build predictive models. He will provide a walkthrough of the data, visualizations, and code for a number of his competition entries.
The talk will also include an introduction to the theory behind Jeremy's favorite modelling algorithm: random forests. He guarantees that by the end of the talk everybody, regardless of their technical background, will understand exactly how random forests work!