Kaggler talk series: Top 0.2% Kaggler on Amazon Employee Access Challenge


Details
This is a step-by-step, intensive two-hour demonstration of model building, focused on an ongoing Kaggle competition.
08/12/2013 Mon Update
Knewton just notified me that I can open up more than 120 spots. If you were on the waiting list and are still interested in coming to tonight's talk, you can sign up now! See you tonight!
07/10/2013 Wed Update
I hope to find a space that can hold 120+ people. This talk is a rare opportunity and should be extremely helpful if you work in data science or are interested in getting into the field.
Please let me know if you can sponsor such a space!
------------------------------------------------
Speaker Bio: Yibo Chen is a data analyst with experience building models such as CRM response models and credit scorecards. Recently he has become interested in Kaggle competitions. By participating in several of them, he has learned a good deal about data mining and has earned a respectable ranking (currently 231st of 104,993 data scientists).
Agenda:
7:00-9:00 Introduction to our solution to the Amazon Employee Access Challenge
Content:
- Feature engineering (extraction and selection).
- Modeling techniques: classifiers including Gradient Boosting Machine, Random Forest, Regularized Generalized Linear Models, and Support Vector Machine.
- Ensembles: stacking based on 5-fold cross-validation to combine the predictions of the base learners.
The software we use is R (2.15.1) with add-on packages including gbm, randomForest, glmnet, kernlab, and Matrix.
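For readers new to stacking, here is a rough sketch of the idea behind the 5-fold scheme mentioned above. It is not the speaker's code. It is written in plain Python rather than R, it runs on synthetic data, and it uses two toy base learners (a nearest-centroid scorer and a k-nearest-neighbor vote) as stand-ins for the GBM, random forest, glmnet, and SVM models named in the agenda. All function names are hypothetical.

```python
import math
import random

def kfold_indices(n, k, seed=0):
    """Shuffle row indices and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def sigmoid(s):
    """Numerically safe logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def fit_centroid(X, y):
    """Toy base learner: distance to class-0 centroid minus distance to class-1 centroid."""
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    return lambda x: math.dist(x, c0) - math.dist(x, c1)

def fit_knn(X, y, k=5):
    """Toy base learner: share of positive labels among the k nearest training rows."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda p: math.dist(x, p[0]))[:k]
        return sum(t for _, t in nearest) / len(nearest)
    return predict

def fit_logistic(Z, y, lr=0.05, epochs=200):
    """Meta-learner: logistic regression trained by plain stochastic gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, t in zip(Z, y):
            g = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b) - t
            w = [wi - lr * g * zi for wi, zi in zip(w, z)]
            b -= lr * g
    return lambda z: sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)

def stack(X, y, base_fitters, k=5, seed=0):
    """Build out-of-fold base predictions, then fit the meta-learner on them."""
    n = len(X)
    oof = [[0.0] * len(base_fitters) for _ in range(n)]
    for fold in kfold_indices(n, k, seed):
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for j, fit in enumerate(base_fitters):
            model = fit(X_tr, y_tr)
            for i in fold:
                # Each row is scored only by a model that never saw it.
                oof[i][j] = model(X[i])
    meta = fit_logistic(oof, y)
    # Refit every base learner on all rows for use at prediction time.
    full_models = [fit(X, y) for fit in base_fitters]
    return (lambda x: meta([m(x) for m in full_models])), oof

# Synthetic two-class data: 50 points around (0, 0) and 50 around (3, 3).
rng = random.Random(42)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (0.0, 3.0) for _ in range(50)]
y = [0] * 50 + [1] * 50

predict, oof = stack(X, y, [fit_centroid, fit_knn])
```

The key design point is that the meta-learner is trained only on out-of-fold predictions, so it learns how much to trust each base learner on data that learner has not seen. Calling `predict([3.0, 3.0])` returns the stacked probability for a new point.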
Reference: the Kaggle competition "The Hewlett Foundation: Short Answer Scoring" and its winners' solutions: http://www.kaggle.com/c/asap-sas/details/winners
