Details

We are very excited to host our first Meetup combining the R and Big Data communities in Denver! Several members asked for more social and networking time, so we have adjusted the agenda accordingly. Other suggestions for improving the meetup are always welcome.

Hope you can make it!

Agenda

  • 6:30 - 7:30 - Socialize over food and drink
  • 7:30 - 7:40 - Announcements
  • 7:40 - 8:15 - R and Hadoop presentation
  • 8:15 - ?:?? - Continued socializing

Presentations

Using R with Hadoop - Nathan will present an example of using R to write a Hadoop streaming job for mapping scores to vectors of data. We tested the job on 34.5M records stored on the Hadoop cluster.

This talk will cover:

  • setting up a Hadoop streaming job
  • using Rscript to run R on the command line
  • writing R code that reads from stdin
  • adjusting the number of mappers to change performance
  • our optimizations to the R code so far
  • our plans for future tests

About the speaker:

Nathan McIntyre is a member of Return Path's analytics team. He started at Return Path as a Software Engineer in 2008. Nathan is a member of both the Denver R User Group and the Boulder/Denver BigData Meetup. He is an avid runner and dancer.
