Past Meetup

"Official" BARUG October 2013 Meeting at DataWeek

217 people went

Fort Mason Center

2 Marina Blvd (at Buchanan St.) · San Francisco, CA

How to find us

Meeting Location: Dev Lounge in the Expo Hall, which is in Fort Mason Center's Festival Pavilion

Details

This meeting is being held in conjunction with the DataWeek (http://dataweek.co/) Conference. The format will be a series of seven 12-minute "lightning" talks.

Agenda:
6:40 pm - Announcements
6:45 pm - Timothy Sweetser: R Analysis of BART Fares
7:00 pm - Utham Kamath: A platform for collaborative apps
7:15 pm - Murray Stokely: Histogram tools for large data sets
7:30 pm - Clark Fitzgerald: Using R on Amazon's EC2
7:45 pm - Elaine Jones: Moving from SAS to R in a DB2 App
8:00 pm - Mathias Brandewinder: R and F# integration
8:15 pm - Harrison Decker: R and Reproducible Research

Note that the meeting will start at 6:40 pm. Pizza should be ready by 6:15 pm, but please come to the DataWeek expo hall early for the Big Data Beer Tasting (http://bigdatabeertasting.com/), which runs from 4:30 pm to 7:00 pm.

Please sign up for a free pass to the DataWeek 2013 Expo here (http://dataweek13.eventbrite.com/?discount=dataweekexpo).

More details on the talks follow.

Timothy Sweetser
R was used for an OLS analysis of BART fares, a topic germane to Bay Area residents. The talk will also include a comparison with DC Metrorail.

Utham Kamath
will speak on a new platform for collaborative analytic apps that is built on several open source languages, with R used for statistical computation.

Murray Stokely
HistogramTools for Distributions of Large Data Sets: This lightning talk presents a new R package with a number of operations to augment the built-in support for histograms in R. Specifically, it includes methods for serializing histograms into a compact Protocol Buffer representation for sharing between distributed tasks, functions for manipulating the resulting aggregate histograms, and functions for measuring and visualizing the information loss associated with histogram representations of a data set. Example applications from my research on distributed storage systems at Google will also be presented.

Clark Fitzgerald
will speak on using R on virtual machines through Amazon's Elastic Compute Cloud (EC2) and demonstrate that you can economically rent and use a high-performance machine in a virtual environment.

Elaine Jones
will speak on how IBM replaced an expensive SAS group license with R for a number of tasks: extracting raw data from DB2, summarizing and formatting it, and loading it into a different DB2 database to support an automated process analytics system for storage products manufacturing.

Mathias Brandewinder
will present the F# R Type Provider (https://github.com/BlueMountainCapital/FSharpRProvider). F# is a functional language on the Microsoft .NET stack (somewhere between Scala and Python). F# has a unique feature, Type Providers, a "bridge" mechanism to data and external resources, and the community has developed a Type Provider to R, which opens interesting scenarios. This makes it possible to get the best of both worlds: I can leverage the power of R and its packages from within the F# environment, use F# and its type system, and integrate F# code targeting R into production code. The owner of the project uses it on a daily basis in a hedge fund to integrate C#, F#, and R code.

Harrison Decker
This talk will explore how the principles of reproducible research are helping drive development in the R community. It will also identify and discuss the functionality of specific R packages pertinent to the digital library community and where opportunities for collaboration exist.