Resource Management in Modern Hadoop Clusters

Hosted by John Leach

Details

Modern Hadoop clusters can share resources elastically among multiple frameworks, each tailored to a specific use case. In this talk, we delve into how YARN, Llama, and other infrastructure components help achieve this. We elaborate on the technical and operational aspects of a typical cluster that shares its resources among (1) MapReduce for traditional batch processing, (2) Spark for complex analytics and machine learning, (3) Spark Streaming for stream processing, and (4) Impala for interactive SQL.

This talk is presented by Karthik Kambatla, a Software Engineer at Cloudera, Hadoop Committer, Hadoop Project Management Committee member, and PhD student. He works primarily on scheduling and resource management in the Hadoop ecosystem.

Cloudera is sponsoring the food and drinks for tonight's meeting.

STL Big Data - Innovation, Data Engineering, Analytics Group
Helix Center
1100 Corporate Square Drive Creve Coeur, MO, 63132 · Saint Louis, MO