"Official" BARUG September Meeting: Google R user talks

After the summer break, we are resuming our meetups with an outstanding lineup of speakers from Google, who will talk about integrating R with Google data storage technologies, doing statistical analysis on large data sets without MapReduce, and writing efficient R code.

6:30 Pizza* and networking
7:00 Announcements
7:10 Sundar Dorai-Raj and Phillip Yelland
7:40 Karl Millar
8:10 Olivia Lau

Talk Abstracts:

Title: "Statistical Analysis at Google-scale with R"
Presenters:  Sundar Dorai-Raj and Phillip Yelland

Abstract: R is an open source programming language used primarily for statistical analysis and data visualization. It is increasingly used by statisticians, economists, and engineers at Google. Learn how R has been integrated with common Google data storage technologies such as Dremel, Bigtable, and ColumnIO, and how it can be used for statistical modeling, forecasting, clustering, and other statistical computations on Google data.

Title:  "Analyzing Large Datasets at Google"
Presenter:  Karl Millar

Analyzing large datasets requires statistical software that scales to data sizes several orders of magnitude larger than R is currently capable of handling. Tools such as MapReduce and Hadoop are capable of scaling to such large data sizes but are impractical for statisticians to use for data analysis.

At Google, we are building packages based on FlumeJava that make it easier for our analysts to work efficiently with large amounts of data without having to deal with the details of MapReduce.  I'll discuss the overall design and API of these packages and how they're able to provide a simple, familiar programming model for data analysis while providing both scalability and performance.

Title:  "3 Tips for Better R Code"
Presenter:  Olivia Lau

This talk illustrates 3 principles for writing easier-to-use, faster, and more efficient R code.


* Pizza and soft drinks will be provided by our Google hosts.

  • Dmitriy L.

    Hello. I am waiting for the slides too! Is there any way to publish them (especially Karl's slides with examples of R Flume API?). I would be eternally grateful. thanks.

    September 18, 2012

  • luba

    Thank you all for joining us Tuesday night at Google! Thanks to Google for hosting & thanks to those that shared their Lightning Talk ideas with me after the presentations. We're getting some great topics lined up for the October Meetup! If you'd like to sign up as a presenter, please send me an e-mail with your topic at [masked]

    September 13, 2012

  • A former member

    If somebody could provide a brief summary of what the talks were about, it would help. Meanwhile I will wait for the slides. Thanks!

    September 12, 2012

    • Ferris J.

      The first speaker went over how they converted R into a core tech at Google. The second discussed various methods for interfacing R with distributed systems for analyzing large data sets. The goal was to have R-like code for map-reduce based systems (read: not parallelizing R itself). The final speaker gave tips for optimizing R on a single machine, and how this can often remedy quite a bit.

      September 12, 2012

    • A former member

      Thanks very much Jumah.. Any chance these were recorded?

      September 12, 2012

  • JiaHuangLiu

    Looking forward to more detail about distributed statistical computing.

    September 12, 2012

  • Gary M.

    Interesting talks - I particularly liked the first speaker's comments on the role of graphics in analysis. The final presentation didn't fit the theme of how R is used at Google (I thought there was a theme; maybe I'm mistaken) and would have been twice as good at half the speed with half the content (the most salient half, that is). A for effort, and thank you.

    September 11, 2012

  • Madana B.

    Nice talks

    September 11, 2012

  • Robert F.

    Great job! Very insightful.

    September 11, 2012

  • Srikanth V.

    Would the slides be available?

    September 11, 2012

  • Spencer A.

    Interesting! Flume really needs to be pushed to the community...

    September 11, 2012

  • Rao V.

    I enjoyed the first two lectures. The third one was a bit too specialized, went too fast, and I was too tired by then.

    September 11, 2012

  • Madana B.

    Good talks

    September 11, 2012

  • Antonio P.

    Carpool leaving from Rockridge BART at 5:30, two seats available. Can talk about R and Hadoop as a bonus.

    September 10, 2012

    • Antonio P.

      With no answer from David, I am going to call 5:30 final. Thanks

      September 11, 2012

    • allan m.

      Hi Antonio. Confirmed for 5:30. See you then. If anything changes, you can send me a text at [masked]. Thanks.

      September 11, 2012

  • Pavel M.

    Any chance of getting a hold of a recording of this meetup or the presentations?

    September 11, 2012

  • A former member

    How good/bad is the chance that they open up the event to those on the waitlist?

    September 10, 2012

  • Jiunjiun M.

    Due to a schedule conflict, I may not be able to attend. Can the session be recorded and made available later? Thanks.

    September 6, 2012
