
"Official" April 2014 Meetup


6:30 - Pizza and networking
7:00 - Announcements
7:05 - Talks begin

Each speaker will have 12 minutes, plus 3 minutes for Q&A.

1. SriSatish Ambati and Anqi Fu: Scalable in-memory ddply(), randomForest, gbm with library(h2o) on Hadoop
2. Gaston Sanchez: Creating Arc Diagrams with R
3. Winston Chen: Data Analysis with RStudio and MongoDB
4. Raman Kapur: Managing Enterprise Cyber Risk by Leveraging Big Data Analytics
5. Ram Narasimhan: The weatherData package
6. Sara Brumbaugh: Running R from Excel through VBA
7. Giovanni Seni: The REgo package for Rule Ensembles


Scalable in-memory ddply(), randomForest, gbm with library(h2o) on Hadoop

This lightning talk highlights an easy way to run R on Hadoop with H2O. Users write regular single-threaded ddply code, 'magic' happens, and it runs in parallel, distributed across multiple machines, with or without Hadoop. The talk includes a short demo and an overview of the architecture: efficient compressed Distributed Frames and a fast execution framework.

SriSatish Ambati

Sri is co-founder and CEO of 0xdata (@hexadata), the builders of H2O. H2O democratizes big data science and makes Hadoop do math for better predictions. Before 0xdata, Sri spent time scaling R over big data with researchers at Purdue and Stanford. Prior to that, Sri co-founded Platfora and was the Director of Engineering at DataStax. Before that, Sri was a Partner & Performance engineer at the Java multi-core startup Azul Systems, tinkering with the entire ecosystem of enterprise apps at scale.

Anqi Fu

Anqi is a data hacker at 0xdata and works on bringing a seamless R experience to Big Data. Anqi earned a master's degree in Economics and a second master's degree in Statistics at Stanford, and a Computer Science degree from Maryland. Her interests include machine learning and optimization.


Creating Arc Diagrams with R
An arc diagram is another way of representing a two-dimensional graph. The nodes are arranged along a horizontal (or vertical) axis, and the edges between the nodes are displayed as arcs. Inspired by the “Similar Diversity” arc diagram (by Steinweber and Koller), Gaston will describe the process he went through with R in order to emulate a Similar Diversity arc diagram using the movie scripts of the Star Wars original trilogy. 

Gaston Sanchez, PhD
Gaston is a statistical programmer working on multivariate methods for analyzing multiblock data, and on data visualization approaches with dimension reduction techniques. He is an enthusiastic useR and the author of several R packages (e.g. `plspm`, `plsdepot`, `tester`). Currently, he is a guest researcher in the Nielsen Group at UC Berkeley.


Building up an easy data analysis platform with RStudio server on top of your MongoDB

Winston Chen


The weatherData package

"weatherData" is an R package that helps retrieve timestamped weather data from the Web as easy-to-use data frames. In this short talk, we will look at some of the enhancements in the newest release of the package. We will also look at a Shiny application that makes use of the data.

Ram Narasimhan is a Bay Area-based operations researcher who works on logistics problems.


Managing Enterprise Cyber Risk by Leveraging Big Data and Analytics

Conventional security tools are unable to effectively prevent the increasing number of cyber attacks from succeeding.

The missing piece is that organizations do not have a correlated, enterprise-wide view of their cyber security risk. We will highlight how organizations can leverage the power of Big Data and analytics using R to locate and measure enterprise cyber security risk, and consequently prevent these risks in a prioritized manner.

Raman Kapur


Running R from Excel through VBA: Turning your Old Scripts into Interactive Tools

R scripting can be combined with Excel's macro language to pass inputs and parameters from a worksheet to a batch process, and back again. Storing the script in a hidden worksheet makes the work even easier. Examples are given, and the addition of a custom Excel menu is demonstrated.

Sara Brumbaugh



The REgo package for Rule Ensembles

REgo is an open-source contribution that "provides a command-line batch interface to the RuleFit statistical model building program. RuleFit refers to Professor Jerome Friedman's implementation of Rule Ensembles."

Giovanni Seni


  • Daniel F.

    ... just saw this item of interest regarding the future of big data in-the-making:
    policies pertaining to privacy (see also considerations on machine learning: "how do you like your toast?")

    April 9, 2014

  • Daniel F.

    Lovely presentations. It was nice to see a glimpse of H2O, and I was impressed by the timely mix of subjects, especially as Star Wars seven is now filming under Disney's Lucasfilm umbrella. Sorry to miss the VBA migration talk and raffle. Good location.

    2 · April 9, 2014

  • A former member

    I realized during Raman's talk that I look forward to winning arguments with my future spouse with quantitative R-stats.

    1 · April 9, 2014

    • Kay K.

      You think you can win any arguments with a spouse?

      1 · April 9, 2014

    • HansW

      I can only say that Ram is lucky that his wife can be persuaded with data :-)

      2 · April 9, 2014

  • Elaine J.

    Raman Kapur's talk was very timely in light of the latest cyber security breach!

    April 9, 2014

  • Gurudev K.

    Good event. Thank You!

    BTW, I accidentally forgot my power supply at the event. Who would be the right person to contact to get it back?

    April 9, 2014

  • John M.

    Really enjoyed all the short presentations, great meetup.

    April 9, 2014

  • NH

    Best meetup

    April 9, 2014

  • John-Mark A.

    I'm looking for "weatherData 0.4" on CRAN - The most current one there is 0.3. Is there another place to look for the most current?

    April 9, 2014

  • John-Mark A.

    Short presentations are a great idea! Let's do more of them.

    April 9, 2014

  • Angela H.

    Enjoyed all the speakers - really fun speaker on weather package.

    April 8, 2014

  • SAI C.

    Anyone carpooling from SFO?

    April 8, 2014

  • Jonathan Y.

    Will there be a video?

    April 7, 2014

  • Spencer G.

    I can give a ride from Intuit to the Rose Garden district of San José (roughly a mile south of the San José International Airport) or to any point more or less in between. Spencer Graves

    April 7, 2014

  • Daniel F.

    Casual carpool afterwards?

    April 7, 2014

  • Spencer G.

    Do you use R to produce videos? What tools do you use? I'm using readImage{EBImage} to read jpg files and rasterImage{graphics} to place them in a plot. I've created png files and married them with an audio using FFmpeg. This works, but I'm looking for something more efficient. findFn{sos} found "animate" and other packages, but I so far haven't found them useful. Thanks, Spencer Graves

    April 7, 2014
