Speeding up R!!!


Details
UPDATED AGENDA
NOTE: Program will start at 7pm sharp. Please try to arrive before 7. It is likely that someone will be manning the door downstairs until about 7:10 or so, but we can't promise much after that.
6:30 - 7:00 Eat/Drink/Network
7:00 - 7:10 Intro and announcements
7:10 - 7:20 10-minute R code warm-up: "Building a Violin - with R" with Mike Messner
7:20 - 8:40 Talks:
- Kenny Darrell from Elder Research will talk about using Revolution R Enterprise to process large datasets in chunks
- Marck Vaisman on tips for speeding up everyday R use, including reading data, cleaning data, and the magic of parallelization with lapply, plyr, doMC, and multicore (a quick sketch of the parallel-lapply idea follows the agenda below)
8:45ish off to beeR! (place TBD)
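As a small teaser for Marck's talk, here is a minimal sketch of parallelizing an lapply-style loop with plyr and doMC. The slow_fit function and the core count are made up for illustration; the only point is how little code the switch to a parallel backend takes.

  # Minimal sketch, assuming a multi-core machine; slow_fit is a
  # hypothetical stand-in for whatever per-item work you need to do.
  library(plyr)
  library(doMC)
  registerDoMC(cores = 4)    # register 4 worker processes as the parallel backend

  slow_fit <- function(i) { Sys.sleep(1); i^2 }

  serial_res   <- llply(1:8, slow_fit)                    # one core
  parallel_res <- llply(1:8, slow_fit, .parallel = TRUE)  # spread across the 4 cores

The same idea works with multicore's mclapply(1:8, slow_fit, mc.cores = 4); Marck will cover when each approach pays off.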
Thanks to HelloWallet for hosting us once again!
---- Additional Info Below ---
We are devoting this month to arming you with the knowledge to address one of R's soft spots: speed. When your data gets big enough or your analysis gets complex enough, R can slow down. We are going to present some strategies and tools to get around these bottlenecks.
Our 10-minute code warm-up: "Building a Violin - with R" with Mike Messner (https://www.meetup.com/R-users-DC/members/10227539/t/mm1_l3)
Mike is going to talk about his project using R to build REAL things. He will share how to use R to generate G-code for computer numerical control (CNC) machining. Yeah, I had to wiki this too: http://en.wikipedia.org/wiki/Numerical_control
Introduction to Revolution's RevoScaleR Package - Kenny Darrell of Elder Research Inc.
There have been many statements made about the R programming language. It’s been called the de facto standard for statistical computing, an odd or quirky language, and a platform that cannot handle “Big Data”. I think as time progresses the first point is becoming true. The second is true, and I do not see that as a bad thing, since R is used for statistics programming and not systems programming. I have heard the third point most frequently, and depending on your definition of “Big Data” it is most likely a myth. There isn’t one go-to method when you need to step outside the in-memory approach. I will demonstrate some of the use cases for one particular method, in which data is stored in XDF files and read into memory in chunks. This functionality is made possible by the RevoScaleR package included in Revolution R Enterprise from Revolution Analytics. I will walk through how to implement a common predictive model with data that will not fit in RAM, and where that differs from the standard approach.
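To give a sense of the workflow Kenny will walk through, here is a rough sketch of the chunked-XDF approach using RevoScaleR. It assumes Revolution R Enterprise is installed, and the file names and column names are invented for illustration only.

  # Rough sketch, assuming Revolution R Enterprise; file and column names are made up.
  library(RevoScaleR)

  # Convert a large CSV to the XDF format, reading roughly 500,000 rows per chunk
  rxImport(inData = "transactions.csv", outFile = "transactions.xdf",
           rowsPerRead = 500000, overwrite = TRUE)

  # Fit a logistic regression without loading the full data set into RAM;
  # RevoScaleR streams the XDF file through memory chunk by chunk.
  fit <- rxLogit(is_fraud ~ amount + region, data = "transactions.xdf")
  summary(fit)

Kenny will show where a real model like this matches, and where it departs from, the usual in-memory glm() workflow.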
Kenny Darrell’s experience spans both the private and government sectors, and both small and large companies. He has applied machine learning techniques to problems in the aerospace industry as a control systems engineer. He has used image recognition methods to detect plume signatures in low signal-to-noise environments, driving adaptive control systems that anticipate and track their trajectories. He has also worked on health monitoring and fault detection for large jet engine programs and has built software to visualize these events in both the time and frequency domains. Currently he is a Data Mining Analyst at Elder Research, where he builds models that help illuminate fraud schemes and implements those models in systems for future fraud detection and rare event/anomaly detection. He holds a Master of Science in Quantitative Analysis and a Bachelor of Science in Aerospace Engineering, both from the University of Cincinnati.
