R and Big Data - Combined Meetup
Details
We are very excited to have our first Meetup combining the R and Big Data communities in Denver! Several members have asked for more social/networking time, so we have adjusted the agenda accordingly. Any other suggestions for improving the meetup are always welcome.
Hope you can make it!
Agenda
6:30 - 7:30 - Socialize over food and drink
7:30 - 7:40 - Announcements
7:40 - 8:15 - R and Hadoop presentation
8:15 - ?:?? - Continued socializing
Presentations
Using R with Hadoop - Nathan will present an example of using R to write a Hadoop streaming job that maps scores to vectors of data. We tested it on 34.5 million records stored on the Hadoop cluster.
This talk will cover:
- setting up a Hadoop streaming job
- using Rscript to run R on the command line
- writing R code that reads from stdin
- adjusting the number of mappers to change performance
- our optimizations to the R code so far
- our plans for future tests
About the speaker:
Nathan McIntyre is a member of Return Path's analytics team. He started at Return Path as a Software Engineer in 2008. Nathan is a member of both the Denver R User Group and the Boulder/Denver BigData Meetup. He is an avid runner and dancer.
