The RHadoop Project


Details
BARUG continues its series on R and Hadoop on Tuesday Oct 11 with a presentation from Antonio Piccolboni, developer on the RHadoop Project (https://github.com/RevolutionAnalytics/RHadoop/wiki). We'll also have a lightning talk about Quantbench, a platform for financial data analysis and exploration. As usual, networking and refreshments will start at 6:30, followed at 7:00 by a lightning talk and our main presentation. Thanks to eBay for providing a venue for this month's meeting.
Agenda
6:30 - 7:00 Networking and pizza (sponsored by Revolution Analytics)
7:00 - 7:10 Introductions and Announcements
7:20 - 7:30 Lightning Talk
Paul Sutter, Introduction to Quantbench
7:30 - 8:30 Keynote Presentation
Antonio Piccolboni, The RHadoop Project
rmr is a new package for performing MapReduce computations in R. It is part of RHadoop, an open source project spearheaded by Revolution Analytics that connects R and the Hadoop ecosystem. In this session I will show what the package can do and cover several examples from machine learning. En route I will try to convince you that we struck the right compromise between power and usability and that you should contribute to this project.
About the speakers:
Antonio Piccolboni is a data scientist with both industrial and academic experience. His recent work includes the design and implementation of a big data analysis package in R, social network analysis for a top-20 global website, and web analytics for a major web ratings company. He is currently an independent consultant with clients including Dataspora and Revolution Analytics. He blogs at blog.piccolboni.info (http://blog.piccolboni.info/) about big data and analytics. His papers have received more than 800 citations and his Erdős number is 3.
Paul Sutter is Quantbench's Co-Founder and CEO. He most recently held the role of President and Co-Founder of Quantcast. He was instrumental in the development of Quantcast's distributed computing architecture, which collects 12 billion records per day from more than 10 million customer websites and processes petabytes of data daily. In 2010 the company was recognized as the 3rd most innovative web company by Fast Company (behind Facebook and Google). Prior to co-founding Quantcast, Paul started the WAN optimization company Orbital Data, which was acquired by Citrix in August 2006. Before that, he founded Transium, an internet search services company, which was acquired by AltaVista in 2000.
