Resource Management in Modern Hadoop Clusters


Details
Modern Hadoop clusters can share resources elastically between multiple frameworks -- each tailored to a specific usecase. In this talk, we delve into how YARN, Llama and other infrastructure components help achieve this. We elaborate on the technical and operational aspects of a typical cluster that shares a cluster between (1) MapReduce for traditional batch-processing (2) Spark for complex analytics and machine learning, (3) Spark-Streaming for stream-processing, and (4) Impala for interactive SQL.
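As a rough illustration of the kind of setup the talk covers, here is a minimal sketch of a YARN Fair Scheduler allocation file (fair-scheduler.xml) that carves a shared cluster into one queue per framework. The queue names, weights, and minimum resources below are illustrative assumptions, not a recommended configuration.

  <?xml version="1.0"?>
  <!-- Hypothetical fair-scheduler.xml: one queue per framework sharing the cluster -->
  <allocations>
    <queue name="batch">        <!-- MapReduce batch jobs -->
      <weight>2.0</weight>
    </queue>
    <queue name="analytics">    <!-- Spark analytics and machine learning -->
      <weight>2.0</weight>
    </queue>
    <queue name="streaming">    <!-- Spark Streaming -->
      <weight>1.0</weight>
      <!-- reserve a floor so latency-sensitive streaming jobs are not starved -->
      <minResources>8192 mb, 4 vcores</minResources>
    </queue>
    <queue name="impala">       <!-- Impala, with resources brokered through Llama -->
      <weight>1.0</weight>
    </queue>
  </allocations>

With weights like these, idle capacity flows to whichever framework has demand, while each queue's share is enforced when the cluster is contended.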
This talk is presented by Karthik Kambatla, a Software Engineer at Cloudera, a Hadoop Committer, a member of the Hadoop Project Management Committee, and a PhD student. He works primarily on scheduling and resource management in the Hadoop ecosystem.
Cloudera is sponsoring the food and drinks for tonight's meeting.
