After the summer break, we are resuming our meet-ups with an outstanding lineup of speakers from Google, who will talk about integrating R with Google data storage technologies, doing statistical analysis on large data sets without MapReduce, and writing efficient R code.
6:30 Pizza* and networking
7:10 Sundar Dorai-Raj and Phillip Yelland
7:40 Karl Millar
8:10 Olivia Lau
Title: "Statistical Analysis at Google-scale with R"
Presenters: Sundar Dorai-Raj and Phillip Yelland
Abstract: R is an open source programming language used primarily for statistical analysis and data visualization. It is increasingly used by statisticians, economists, and engineers at Google. Learn how R has been integrated with common Google data storage technologies such as Dremel, Bigtable, and ColumnIO, and how it can be used for statistical modeling, forecasting, clustering, and other statistical computations on Google data.
Title: "Analyzing Large Datasets at Google"
Presenter: Karl Millar
Abstract: Analyzing large datasets requires statistical software that scales to data sizes several orders of magnitude larger than R is currently capable of handling. Tools such as MapReduce and Hadoop are capable of scaling to such large data sizes but are impractical for statisticians to use for data analysis.
At Google, we are building packages based on FlumeJava that make it easier for our analysts to work efficiently with large amounts of data without having to deal with the details of MapReduce. I'll discuss the overall design and API of these packages and how they're able to provide a simple, familiar programming model for data analysis while providing both scalability and performance.
Title: "3 Tips for Better R Code"
Presenter: Olivia Lau
Abstract: This talk illustrates 3 principles for writing easier-to-use, faster, and more efficient R code.
* Pizza and soft drinks will be provided by our Google hosts.