Airflow & Luigi - the Flow of your Data


Details
18:00 - 18:30 - Mingling
18:30 - 19:15 - Orchestrating in the Wild with Airflow - Iddo Aviram @ SimilarWeb
19:15 - 20:00 - Luigi on Steroids: workflow == code - Evgeny Shulman @ Crosswise - Oracle Data Cloud
Talk Title: Orchestrating in the Wild with Airflow
Abstract:
Airflow is an open-source framework we just adore at SimilarWeb R&D. In a big team-wide effort, we migrated all of our workflows to Airflow, and the transition has really proven to be worthwhile.
I’ll analyze some design properties that give Airflow an edge over other similar frameworks like Luigi, Oozie and Azkaban, and talk about what a production deployment of Airflow looks like in practice.
I’ll also talk about some of the struggles we’re still facing and how we’re trying to address them. Although Airflow shines for many use-cases, pain points still exist, and I’ll discuss challenges like orchestrating large-scale data reruns using the framework.
Bio:
Iddo Aviram is a data engineer at SimilarWeb. He was formerly a system architect at Totango, and holds an M.Sc. from the Department of Computer Science at Ben-Gurion University of the Negev.
Talk Title: Luigi on Steroids: workflow == code
Abstract:
Why should any production workflow be defined by code? At Crosswise we process terabytes of data, we have more than 500 logical tasks in every flow, we recreate models on the fly, evaluate them and apply them in every run, and we don’t hesitate for a moment when we want to change our pipelines. We do all of that the Luigi way. We will talk about job/task wiring, production issues and their solutions, and the important points to consider when selecting the right workflow solution for you.
Bio:
Evgeny (https://www.linkedin.com/in/shulmane) has been part of the technology industry for over 12 years in many different positions and areas, including low-level drivers, high-availability servers, flow automation, and machine learning systems. He has specific experience in the retargeting industry and understands how big data drives results in adtech. In his current position at Crosswise, he’s responsible for planning the architecture that enables the Crosswise product to process terabytes of data efficiently.
