We will be meeting in room 201 at the Meeting House in Columbia, MD.
RHadoop Data Hacking
Ed Kohlwey, Booz Allen Hamilton
RHadoop is an effective platform for exploratory data analysis over big data sets. The convenience of an interactive command-line interpreter, combined with the large number of statistical and machine learning routines implemented in R libraries, makes it a highly effective environment for elementary data science.
We'll discuss the basics of RHadoop: what it is, how to install it, and the API fundamentals. Next we'll cover common use cases for RHadoop. Last, we'll run through an interactive example.
Ed Kohlwey is a developer/hacker/data scientist at Booz Allen Hamilton. He is generally interested in parallel computation and data analytics, and has worked in many problem domains including cyber, genomics, and finance. Ed is also one of the Meetup group coordinators.
Rather than having a second presenter, we're going to try something new: lightning talks. This is a free-for-all: give a presentation on any topic that you think will interest the group. If it's bad, we'll probably just boo you off the stage ;).
To be fair, we DO ask that you sign up here to give your talk. That way people will know if they have a shot at giving their talk, and people can also see the general talk topics that will be given and decide whether or not they want to stick around. Here are the general rules:
- Talks will be given priority on a first-registered, first-served basis.
- Talks should be limited to 10 minutes or less.
- Talks can be on any topic you think would be interesting to the group.
- If you have materials for your talk, e-mail them to Ed Kohlwey in PowerPoint format no later than 5pm EST on August 20th at kohlwey_edmundatbahdotcom. Ed will consolidate them into a single, in-order deck to keep things moving quickly.
- Approximately 90 minutes are allocated for talks, and we will do as many of them as possible. That probably means 9-10 talks.
- Q+A should be reserved for the general meandering that always follows the meeting.
5:30-6:00 pm - Networking and snacks
6:00-6:10 pm - Announcements and kickoff
6:10-6:45 pm - RHadoop presentation
6:45-8:15 pm - Lightning talks