Apache Hadoop 2 includes a new MapReduce engine, which has a number of advantages over the previous implementation, including better scalability and resource utilization. The new implementation is built on YARN, a general resource management system for running distributed applications. In this talk, Tom will walk through the new architecture and compare MapReduce 2 running on YARN with MapReduce 1.
Tom White is one of the foremost experts on Hadoop. He has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. His book Hadoop: The Definitive Guide (O'Reilly) is recognized as the leading reference on the subject. In 2011, Whirr, the project he founded to run Hadoop and other distributed systems in the cloud, became a top-level Apache project.
Tom is a software engineer at Cloudera, where he has worked since its founding on the core distributions from Cloudera and Apache. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net (http://java.net/), and IBM's developerWorks, and has spoken at several conferences, most recently at ApacheCon and OSCON in 2011. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.