
Data Validation and Alerting. How does Airflow fit in?

Hosted By
Brian L.

Details

Note: This meetup event is being organized as a special joint effort with the NYC Data Council Meetup group:
https://www.meetup.com/DataCouncil-AI-NYC-Data-Engineering-Science/

Talk 1: Data Validation and Alerting. How does Airflow fit in?
Abstract:
After your ETL runs, a new kind of fun starts.
-Is my output data 'right' compared to my 'source of truth'?
-Wait a second, how do I even know if my input data was ok?
-How do I get alerted if a metric violates some threshold/tolerance or if some dimensional data is messed up?
-What if I want alerts to be triggered based on dynamic thresholds?
-How hard is it to maintain my checks and alerts?
Like everyone else, the New York Times' Data Engineers, Data Analysts and Data Scientists have been wrestling with the above questions. This presentation will cover what the Times has tried and the approach that's been settled on (for now). And yes, Airflow plays an important part.

Presenters:
Brian Lavery, Data Engineer, New York Times
Mariam Melikadze, Manager-Advertising Analytics, New York Times

Talk 2:
Abstract:
Apache Airflow is a Python-based task orchestrator that has seen widespread adoption among startups and enterprises alike to author, schedule, and monitor data workflows.
By deploying the Airflow stack via Helm on Kubernetes, fresh environments can be easily spun up or down, scaling to near 0 when no jobs are running. As companies scale up their Airflow usage, they need more control and observability over their stack as it becomes more ingrained in their culture and more important to the business.
This talk will go through the technical challenges of supporting thousands of Airflow deployments: how to monitor them, how to reliably push DAG updates, and how to build all the supporting infrastructure for a rock-solid Airflow system in a cloud-native environment using open source software.

Presenter:
Viraj Parekh, Data Engineer, Astronomer

Pizza, drinks and mingling will follow the presentation.

Instructions to follow upon arrival:
Enter the lobby on the north side of the building. A representative will be waiting next to one of the north-end elevator turnstiles with a sign that says 'Airflow Meet-Up'. They will help you get through security and send you up to the 15th floor, where another representative will be waiting to direct you to the room.

NYC Apache Airflow Meetup
620 8th Ave · New York, NY