July 2016 Meetup

Details

Thanks to Pinterest for hosting us! This location is walking distance from the Caltrain SF Station.

Agenda:

• 6pm Food, Drinks, Networking

• 6:30pm Pinterest Welcome

• 6:40pm Open Sourcing Dr. Elephant: Pills for your problematic Hadoop jobs

At LinkedIn, employees with different levels of Hadoop experience use different frameworks to run their Hadoop jobs. A user may run a job that abuses cluster resources without even realizing it, and many who did realize their jobs were problematic didn’t know how to tune them. Dr. Elephant was born to solve these pain points. A simple web app, it provides job diagnostics right after each job finishes and tells the user what pills to take. We also built our automated job performance monitoring pipeline on top of it.

In this talk, I will share our experience at LinkedIn optimizing user jobs, the challenges we faced, and how a simple self-serve tool like Dr. Elephant helped overcome them. Dr. Elephant helps Hadoop users understand, analyze, and tune their Hadoop and Spark applications easily, improving both their productivity and the cluster’s efficiency. I’ll also share how we integrated the tool into our development lifecycle and encouraged developers to optimize their jobs with minimal support from the Hadoop experts. Finally, I will discuss the tool itself: how it gathers its information, and how to write custom heuristics and plug them into Dr. Elephant.

Dr. Elephant is now open source. Please check out our blog and GitHub page for more information.

San Francisco Hadoop Users
Pinterest
580 7th Street · San Francisco, CA