Due to an unfortunate flight delay, we have a last-minute change in speakers for this meeting. Apache Hadoop committer and Yahoo technical lead Robert Evans will present on YARN in Arun's place. Bobby is currently a Senior Software Engineer and Technical Lead at Yahoo! Inc., developing machine learning software on Hadoop. He is a core committer to Apache Hadoop and has been part of the development team building MapReduce and HDFS for nearly six years. For the past year, Bobby has focused on the next generation of MapReduce and Apache YARN.
Fortunately, Arun should still be able to join us via Skype, so have your questions ready for him.
We're extremely excited to welcome Arun Murthy, the Chair of the Apache Hadoop PMC and co-founder of Hortonworks, to our August Meetup to talk about YARN, the next-gen application framework for Hadoop. We're also excited to be hosting our first event at the great meeting space at 1871, thanks to the good folks at Hortonworks. More info below – look forward to seeing everyone there!
Hortonworks will be generously providing food and drink for the meeting attendees.
YARN - Future of Data Processing with Apache Hadoop
Presented by VP, Apache Hadoop, Arun Murthy
Apache Hadoop MapReduce has been overhauled to emerge as Apache Hadoop YARN, a generic distributed application framework that supports MapReduce and other application paradigms. This change recasts Hadoop as a much more powerful data-processing system, making it very different from what it was 12 months ago.
The fundamental idea of YARN is to split the two major functions of the JobTracker, resource management and job scheduling/monitoring, into separate daemons: a global ResourceManager (RM) and a per-application ApplicationMaster (AM). The ResourceManager and the per-node slave, the NodeManager (NM), form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The per-application ApplicationMaster is, in effect, a framework-specific library tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
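For readers who want a concrete picture of this division of labor, here is a minimal toy sketch in Python. It is not the YARN API: the class names mirror the daemons described above, but every method name (allocate, launch, run), the memory-only resource model, and the first-fit placement policy are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Per-node agent: tracks free memory and runs tasks in containers."""
    name: str
    free_mb: int
    running: list = field(default_factory=list)

    def launch(self, app_id, task, mb):
        # Record the task as running on this node and return a container label.
        self.running.append((app_id, task, mb))
        return f"{app_id}/{task}@{self.name}"


class ResourceManager:
    """Global arbiter: decides which node satisfies each container request."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, mb):
        # Toy first-fit policy (an assumption, not YARN's real scheduler).
        for node in self.nodes:
            if node.free_mb >= mb:
                node.free_mb -= mb
                return node
        return None  # no node has enough free memory


class ApplicationMaster:
    """Per-application: negotiates containers from the RM, then works with NMs."""

    def __init__(self, app_id, rm):
        self.app_id = app_id
        self.rm = rm

    def run(self, tasks, mb_per_task):
        launched, pending = [], []
        for task in tasks:
            node = self.rm.allocate(mb_per_task)
            if node is None:
                pending.append(task)  # would be retried later in a real system
            else:
                launched.append(node.launch(self.app_id, task, mb_per_task))
        return launched, pending


def demo():
    nodes = [NodeManager("node1", 2048), NodeManager("node2", 1024)]
    rm = ResourceManager(nodes)
    am = ApplicationMaster("app_1", rm)
    return am.run(["map0", "map1", "map2", "reduce0"], 1024)


if __name__ == "__main__":
    launched, pending = demo()
    print("launched:", launched)
    print("pending:", pending)
```

In this sketch the ResourceManager knows nothing about MapReduce; only the ApplicationMaster understands what its tasks are, which is what lets other programming paradigms share the same cluster.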
This talk will cover YARN's design and architecture in more detail, showing how it improves Apache Hadoop's data processing via Hadoop MapReduce and how it allows for other programming paradigms on Hadoop grids.
Arun is VP, Apache Hadoop at the Apache Software Foundation (i.e., Chair of the Apache Hadoop PMC) and has been a full-time contributor to Hadoop since the project's inception in 2006. He is also the lead of the MapReduce project and has focused on building NextGen MapReduce (YARN). Prior to co-founding Hortonworks, Arun was responsible for all MapReduce code and configuration deployed across the 42,000+ servers at Yahoo!. In essence, he was responsible for running Apache Hadoop’s MapReduce as a service for Yahoo!. He also jointly holds the current world sorting record using Apache Hadoop. Follow Arun on Twitter: @acmurthy.