This is a combined meetup with the LA R meetup group. Hadoop is rapidly being adopted as a major platform for storing and managing massive amounts of data, and for computing descriptive and query-style analytics on that data. However, it has a reputation for being unsuitable for high-performance, complex iterative algorithms such as logistic regression, generalized linear models, and decision trees.
This presentation will explain and demonstrate how to use R with Hadoop for high-performance analytics that scale. We will present three R packages designed to work with Hadoop:
• Revolution R Enterprise ScaleR
Each of these R packages gives data scientists the ability to work with data stored in Hadoop and to leverage the full power of the MapReduce framework for model building, model estimation, data transformation, and visualization.
Revolution Analytics presenter bios:
David Champagne is an innovative technology leader with over 20 years of experience in enterprise and web application development for business customers across a wide range of industries. As Chief Architect at Revolution Analytics he has led the development teams and has overall product responsibilities. Prior to joining Revolution Analytics, he was Principal Architect/Engineer for SPSS.
Antonio Piccolboni is a data scientist with both industrial and academic experience. His recent work includes the design and implementation of a big data analysis package in R, social network analysis for a top-20 global website, and web analytics for a major web ratings company. He is currently an independent consultant with clients including Dataspora and Revolution Analytics.