"Official" BARUG October 2013 Meeting at DataWeek


Details
This meeting is being held in conjunction with the Data Week (http://dataweek.co/) Conference. The format will be a series of seven 12-minute "lightning" talks.
Agenda:
6:40 pm - Announcements
6:45 pm - Timothy Sweetser: R Analysis of BART Fares
7:00 pm - Utham Kamath: A platform for collaborative apps
7:15 pm - Murray Stokely: Histogram tools for large data sets
7:30 pm - Clark Fitzgerald: Using R on Amazon's EC2
7:45 pm - Elaine Jones: Moving from SAS to R in a DB2 App
8:00 pm - Mathias Brandewinder: R and F# integration
8:15 pm - Harrison Decker: R and Reproducible Research
Note that the meeting will start at 6:40 pm. Pizza should be ready by 6:15 pm, but please come to the Data Week expo hall early for the Big Data Beer Tasting (http://bigdatabeertasting.com/) (4:30 pm - 7:00 pm).
Please sign up for a free pass to the DataWeek 2013 Expo here (http://dataweek13.eventbrite.com/?discount=dataweekexpo).
More details on the talks follow.
Timothy Sweetser
R was used for an OLS analysis of BART fares, a topic germane to Bay Area residents. The talk will also include a comparison with DC Metrorail.
Utham Kamath
will speak on a new platform for collaborative analytic apps that is based on several open source languages, with R used for statistical computation.
Murray Stokely
HistogramTools for Distributions of Large Data Sets: This lightning talk presents a new R package with a number of operations to augment the built-in support for Histograms in R. Specifically, methods are included for serializing histograms into a compact Protocol Buffer representation for sharing between distributed tasks, functions for manipulating the resulting aggregate histograms, and functions for measuring and visualizing the information loss associated with histogram representations of a data set. Example applications from my research on distributed storage systems at Google will also be presented.
Clark Fitzgerald
will speak on using R on virtual machines through Amazon's Elastic Compute Cloud (EC2) and demonstrate that you can economically rent and use a high-performance machine in a virtual environment.
Elaine Jones
will speak on how IBM replaced an expensive SAS group license with R to do a number of tasks relating to extracting raw data from DB2, summarizing, formatting and loading it into a different DB2 database in order to support an automated process analytics system for storage products manufacturing.
Mathias Brandewinder
will present the F# R Type Provider (https://github.com/BlueMountainCapital/FSharpRProvider): F# is a functional language on the Microsoft .NET stack (somewhere between Scala and Python). F# has a unique feature, Type Providers, a "bridge" mechanism to data and external resources, and the community has developed a Type Provider to R, which opens interesting scenarios. This makes it possible to get the best of both worlds: I can use the power of R and its packages from within the F# environment, and I can use F# and its type system to integrate F# code targeting R into production code. The owner of the project uses it on a daily basis in a hedge fund to integrate C#, F# and R code.
Harrison Decker
This Snapshot will explore how the principles of reproducible research are helping drive development in the R community. It will also identify and discuss the functionality of specific R packages pertinent to the digital library community and where opportunities for collaboration exist.
