YARN: Overview, Tez: Faster query processing on Hadoop - Siddharth Seth/Hortonworks

Hosted By
Siobhain T.
Details

Apache Hadoop YARN - an overview
Apache Hadoop 2 recently reached general availability (GA). The release includes YARN, which allows applications other than MapReduce to run on a Hadoop cluster by providing interfaces for cluster resource scheduling and application management.
The first part of the presentation will cover the YARN architecture and its advantages over the Hadoop 1 JobTracker. It will also touch on what users need to do to migrate their MapReduce and other jobs from Hadoop 1 to Hadoop 2.

Apache Tez: taking Hadoop computation beyond MapReduce
Apache Tez is a distributed execution framework that runs a complex DAG of tasks. This fits well with Hive and Pig, which were otherwise forced to model their execution plans as multiple MapReduce jobs. This presentation will talk about what Tez is, the features it provides, and what is coming. It will also cover some basics of how Hive already uses Tez to improve query runtimes.
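
To make the DAG idea concrete, here is a minimal toy sketch in Python. It does not use the Tez API; the task names, the `run_dag` helper, and the sample data are all hypothetical. It only illustrates the general idea of running a query plan as one graph of dependent tasks, with each task receiving the outputs of its upstream tasks directly, rather than as a chain of separate jobs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(tasks, deps):
    """Run every task once, after all of its upstream tasks have finished.

    tasks: task name -> callable taking the upstream results as arguments
    deps:  task name -> list of upstream task names (order = argument order)
    """
    results = {}
    order = []
    for name in TopologicalSorter(deps).static_order():
        upstream = [results[d] for d in deps.get(name, ())]
        results[name] = tasks[name](*upstream)
        order.append(name)
    return results, order


# Hypothetical query plan: scan two tables, join them, then aggregate.
tasks = {
    "scan_orders": lambda: [("alice", 10), ("bob", 5), ("alice", 7)],
    "scan_users": lambda: {"alice": "US", "bob": "UK"},
    "join": lambda orders, users: [(users[u], amt) for u, amt in orders],
    "aggregate": lambda rows: {
        c: sum(a for cc, a in rows if cc == c) for c, _ in rows
    },
}
deps = {
    "scan_orders": [],
    "scan_users": [],
    "join": ["scan_orders", "scan_users"],
    "aggregate": ["join"],
}

results, order = run_dag(tasks, deps)
print(order)
print(results["aggregate"])
```

In this sketch the join and aggregate steps are simply further nodes in the same graph. A real engine would additionally distribute the tasks across a cluster and move data between them, which this example does not attempt to model.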

Bio: Siddharth Seth
Siddharth Seth is a software engineer at Hortonworks, where he works on the Apache Tez project and the Apache Hadoop project, with a focus on YARN and MapReduce. He is a member of the Apache Tez PPMC and the Apache Hadoop PMC. Prior to this he spent several years working on search platforms and Oozie at Yahoo.

Bio: Xuan Gong
Xuan Gong works as a software engineer at Hortonworks, where he focuses on Hadoop YARN and MapReduce. Prior to this, he spent one and a half years working as a software engineer on big data processing at ADP.

Los Angeles AI/LLMs/ML Meetup
Shopzilla
12200 Olympic Blvd, 4th floor · Los Angeles, CA