Detailed agenda and summaries to follow. General agenda:
- 6:00 - 6:30 - Socialize over food and beer(s)
- 6:30 - 7:00 - Next Generation Hadoop MapReduce
- 7:00 - 7:30 - Next Generation Hadoop Operations at Facebook
Next Generation Hadoop MapReduce: This talk covers the Apache Hadoop[masked] branch and the next generation of Hadoop MapReduce. The Apache Hadoop MapReduce framework has hit a scalability limit at around 4,000 machines. We are developing the next generation of Apache Hadoop MapReduce, which factors the framework into a generic resource scheduler and a per-job, user-defined component that manages application execution. Because downtime is more expensive at scale, high availability is built in from the beginning, as are security and multi-tenancy to support many users on the larger clusters. The new architecture will also increase innovation, agility, and hardware utilization.
Presenter: Arun C Murthy, Yahoo!
Next Generation Hadoop Operations at Facebook: Hadoop's traditional role as a framework for batch-oriented execution of map-reduce jobs is rapidly expanding to include many other use cases, such as HBase, Scribe, and low-latency ad hoc queries of large datasets. Downtime is becoming less acceptable, and existing map-reduce jobs continue to get larger, with tighter expectations around completion time. Storage and data retention requirements continue to grow. Quite simply, it is both an amazing and extremely challenging time to be a Hadoop administrator.
In this talk we will discuss the challenges our operations team faces in 2011 as we grow and manage a variety of Hadoop clusters throughout Facebook, and the solutions we are developing to address them. We will share our key best practices and lessons learned, and show how they can be applied in any organization.
Presenter: Andrew Ryan, Facebook
Yahoo Campus Map:
Location on Wikimapia: