Workflow Engines for Hadoop

Joe Crobak from Foursquare will give a brief overview of how a workflow engine fits into a standard Hadoop-based analytics stack, and an architectural overview of Azkaban, Luigi, and Oozie. He will elaborate on some features, tools, and best practices that will help you build out a Hadoop workflow system from scratch or improve an existing one.

About the talk:

Building a reliable pipeline of data ingress, batch computation, and data egress with Hadoop can be a major challenge. Most folks start out with cron to manage workflows, but soon discover that doesn't scale past a handful of jobs. There are a number of open-source workflow engines with support for Hadoop, including Azkaban (from LinkedIn), Luigi (from Spotify), and Apache Oozie. Having deployed all three of these systems in production, Joe will talk about what features and qualities are important for a workflow system.

About the speaker:

Joe Crobak worked on Hadoop and analytics infrastructure at Foursquare, where he built internal tools and APIs used by dozens of engineers and analysts on a daily basis.

Twitter: @joecrobak

Previous video:

Etsy on Skyline

http://g33ktalk.com/etsy-a-deep-dive-into-monitoring-with-skyline/

Our Sponsors

  • Hakka Labs

    Growing the largest community of data engineers and data scientists

  • Spotify

    Big thanks to Spotify for helping support & host NYC Data Eng!
