Double Header with John Myles White and Tal Galili


Details
We have an amazing two-for-one this month: John Myles White and Tal Galili.
First up we have Tal:
Creating beautiful trees of clusterings with R (+a bonus)
Tal Galili (http://r-statistics.com/), the founder of R-bloggers (http://r-bloggers.com) and a PhD student in statistics, will present his recent work on the package "dendextend" (http://github.com/talgalili/dendextend), intended for visualizing and comparing trees of hierarchical clusterings (a.k.a. dendrograms) with R.
There will be a short overview of the "dendrogram" object in R and its manipulation with the "dendextend" package. After this talk you will know how to create, change, visualize, and statistically compare two trees of hierarchical clusterings (with some sprinkles of Rcpp).
The "bonus talk" in this meetup will be the 5-minute lightning talk Tal gave at the recent useR! 2013 conference about how to quickly update R on Windows/Mac using the 'installr' package.
Then we have John:
Streaming Data Analysis and Online Learning
John Myles White is one of the primary developers of Julia, a new language for technical computing. John is currently developing the statistical and machine learning infrastructure for Julia. In addition, he is one of the residents at Hacker School's Summer 2013 program. John recently finished his Ph.D. at Princeton, where he developed models of human decision-making. During grad school, John co-wrote Machine Learning for Hackers and Bandit Algorithms for Website Optimization. Starting in the fall, John will be a research scientist at Facebook.
Traditional statistical software suites are specialized for analyzing data sets that can fit in RAM. But many modern applications require that we analyze much larger data sets. In this talk, I'll survey some of the basic methods for analyzing data in a streaming manner. I'll focus on using stochastic gradient descent (SGD) to fit models to data sets that arrive in small chunks. I'll discuss some basic implementation issues and demonstrate the effectiveness of SGD for problems like linear and logistic regression as well as matrix factorization. I'll also describe how these methods allow ML systems to adapt to user data in real time.
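For readers who want a feel for the idea before the talk, here is a minimal sketch of SGD fitting a linear regression to data that arrives in small chunks. It is not taken from John's talk or from any Julia package. The function names (sgd_linear_regression, synthetic_stream), the learning rate, and the synthetic data are all assumptions made purely for illustration.

```python
import random


def sgd_linear_regression(chunks, n_features, learning_rate=0.01):
    """Fit y ~ w.x + b by SGD, visiting each (x, y) pair exactly once."""
    weights = [0.0] * n_features
    bias = 0.0
    for chunk in chunks:  # only one chunk is held in memory at a time
        for x, y in chunk:
            prediction = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = prediction - y
            # Gradient step on the squared-error loss 0.5 * error**2
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def synthetic_stream(n_chunks, chunk_size, seed=0):
    """Hypothetical data source: y = 2*x1 - 3*x2 + 1 plus a little noise."""
    rng = random.Random(seed)
    for _ in range(n_chunks):
        chunk = []
        for _ in range(chunk_size):
            x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
            y = 2.0 * x[0] - 3.0 * x[1] + 1.0 + rng.gauss(0, 0.01)
            chunk.append((x, y))
        yield chunk


weights, bias = sgd_linear_regression(
    synthetic_stream(n_chunks=200, chunk_size=50),
    n_features=2,
    learning_rate=0.05,
)
print(weights, bias)
```

Because the chunks come from a generator, the full data set never has to fit in RAM; the same loop would work on chunks read from a file or a network feed. A fixed learning rate keeps the sketch short, though a decaying schedule is a common alternative.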
Pizza begins at 6:15 and the speakers start at 7; then we'll head to the bar whenever we finish.
Please arrive early to grab a seat. We understand space is an issue and we are doing our best to handle the situation. Please help us out by RSVPing as soon as you can.
