57th Bay Area Hadoop User Group (HUG) Meetup


Details
It’s been a long time since our last meetup!
We are rebooting this meetup to work with the community and share more information on recent releases and how folks can start to leverage the latest innovations.
Apache Hadoop has had several recent releases (3.0.x and 3.1.x) with many new enhancements, and the community continues to innovate, with more releases on the way. In this meetup, Hadoop community members will share information, use cases, and the community's experience. If you are interested in a particular topic or would like to speak at a future event, please reach out to the HUG meetup leadership team.
This meetup will be focused on Apache Hadoop 3.1. There will be summary talks on YARN and HDFS enhancements to start. In the next set of meetups, we invite community members for deeper dives.
6:30 – 6:55 PM Networking / Social
7:00 – 9:00 PM Presentations
YARN in Apache Hadoop 3.x: Updates & Demos
Description
Apache Hadoop YARN is the modern distributed operating system for big data applications. It morphed the Hadoop compute layer into a common resource management platform that can host a wide variety of applications. Many organizations leverage YARN to build their applications on top of Hadoop without having to repeatedly worry about resource management, isolation, multi-tenancy issues, etc.
In this talk, we'll start with the current status of Apache Hadoop YARN in Apache Hadoop 3.1.x and how it is used today. We'll then cover the present and future of YARN: features that are further strengthening YARN as the first-class resource management platform for data centers running enterprise Hadoop.
Speaker(s)
Wangda Tan / Vinod Kumar Vavilapalli
Storage in Apache Hadoop 3.x: HDFS’s evolution and Ozone Introduction
Description
HDFS has several strengths. It scales IO bandwidth horizontally over petabytes of storage, provides very low-latency metadata operations, and scales to over 60K concurrent clients. Apache Hadoop 3.0 recently added Erasure Coding, support for multiple NameNodes, and HDFS federation improvements. We will talk about the latest HDFS enhancements in Apache Hadoop 3.0 and what the road ahead might look like.
One of HDFS's limitations is scaling the number of files and blocks in the system. We describe a radical change to Hadoop's storage infrastructure with the upcoming Ozone filesystem, which allows Hadoop to scale to tens of billions of files. Ozone fundamentally separates the namespace layer from the block layer. Further, the use of the Raft protocol has allowed the storage layer to be self-consistent. We will provide a high-level overview of the Ozone architecture.
Speaker(s)
Hanisha Koneru / Arpit Agarwal
