At this meeting, three speakers will talk about modeling, each covering a different R package. We'll also have plenty of time for discussion, so this is a good opportunity for anyone to jump in and share thoughts on using these or related modeling packages.
The packages discussed are:
randomForest: Random forests offer a very powerful technique for statistical learning. This talk will introduce random forests by reference to the CART (Classification And Regression Trees) algorithm. It will provide some guidelines about when random forests are likely to be useful. A few key strengths of random forests will be reviewed, such as the ability to create accurate forecasts and to determine the importance of predictors. A small example will be shown to demonstrate the use of the randomForest package in R.
lme4: lme4 is an R package for random effects modeling, and currently one of the most popular packages on CRAN. lme4 supports both hierarchical and crossed data and many distribution families, which also makes it a suitable tool for, for example, IRT (item response theory) and repeated measures data.
caret: The caret package provides a unified interface (wrapper) to the many supervised learning (classification and regression) functions and packages in R, such as rpart (decision trees), nnet (neural nets), linear models, boosting, random forests, and support vector machines. It also provides an easy-to-use framework for tuning the complexity/regularization parameters of the various models, with model evaluation based on cross-validation and similar techniques. Finally, it has tools for data preparation, model diagnostics (variable importance, ROC), and more.
Eric Kostello, PhD, is the Director of Product Development in the Global Product Strategy and Integration group of The Nielsen Company. He is currently working with Nielsen/NRG, which specializes in research for movie studios. He has extensive experience working with survey data and has conducted investigations into topics ranging from user-generated content metrics to the transition to market economies in Eastern Europe.
Jeroen Ooms is a visiting scholar from the Netherlands at the UCLA Dept. of Statistics, who completed his master's degree at Utrecht University last year. He uses R especially for developing web applications, as seen on his website: http://www.stat.ucla....
Szilard Pafka, PhD, is Chief Scientist at Epoch, a credit card transaction processor. He uses R for various data analysis, statistical modeling and visualization projects. Previously, he worked on a variety of research problems ranging from physics to financial prices.
We should have plenty of time (and cookies) for more discussions and networking after the meeting.
Please RSVP as places are limited.
Venue and starting time as usual:
UCLA Boelter Hall Room 9413 (the lab), 6:30pm
(for those coming for the first time, please consult the detailed venue information given at the first meeting)