Pune Apache Airflow® Meetup at Pattern!


Details
Join fellow Apache Airflow enthusiasts at Pattern for an afternoon of engaging presentations, great conversation, delicious food and beverages, and of course, swag!
The afternoon will consist of three presentations followed by networking. We can't wait to see you there!
PRESENTATIONS
Presentation #1: Orchestrating dbt-core Data Pipelines with Cosmos and Apache Airflow 3
- Speaker: Pankaj Koti, Open Source Engineer at Astronomer & Apache Airflow Committer
As organisations increasingly adopt dbt for data transformation and Apache Airflow for orchestration, there is a growing need for seamless integration between the two. Cosmos is an open-source framework designed to bridge this gap, enabling native, scalable, and testable dbt execution within Airflow DAGs. With the upcoming release of Apache Airflow 3, significant enhancements around performance, maintainability, and extensibility set the stage for an even tighter integration and better developer experience.
Cosmos simplifies the way data teams run dbt by translating dbt models, seeds, and snapshots into Airflow tasks, allowing fine-grained visibility, parallel execution, and native scheduling. It supports both local and remote dbt execution contexts, with compatibility for cloud-native environments such as Kubernetes and Docker. Key features include dependency-based task generation, cache management, remote storage support, and first-class integration with Airflow.
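For a taste of what this looks like in practice, here is a minimal sketch of a Cosmos DbtDag; the dbt project path, Airflow connection ID, and schema are illustrative placeholders:

```python
# A minimal Cosmos DbtDag sketch. The project path, connection ID
# ("postgres_default"), and schema are illustrative placeholders.
from datetime import datetime

from cosmos import DbtDag, ProfileConfig, ProjectConfig
from cosmos.profiles import PostgresUserPasswordProfileMapping

profile_config = ProfileConfig(
    profile_name="jaffle_shop",
    target_name="dev",
    # Build the dbt profile from an existing Airflow connection.
    profile_mapping=PostgresUserPasswordProfileMapping(
        conn_id="postgres_default",
        profile_args={"schema": "public"},
    ),
)

# DbtDag parses the dbt project and renders each model, seed, and
# snapshot as its own Airflow task, preserving dbt's dependency graph,
# so retries, logs, and scheduling are per model rather than per run.
jaffle_shop = DbtDag(
    dag_id="jaffle_shop",
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
    profile_config=profile_config,
    schedule="@daily",
    start_date=datetime(2025, 1, 1),
)
```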
In this talk, we’ll explore:
- Why dbt + Airflow is such a powerful combo
- How Cosmos translates your dbt project into Airflow DAGs
- Cosmos DAGs running in Airflow 3
- Best practices, patterns, and lessons learned from real-world deployments
Whether you’re new to Airflow or already running dbt in production, you’ll walk away with practical ideas and tools to improve your data platform using Cosmos and Airflow 3.
---------------------------------------------------------------------------------------
Presentation #2: Accelerating the Data Lake Journey with Airflow and Heimdall
- Speaker: Sanket Jadhav, Principal Data Engineer at Pattern
Discover how we integrate Airflow with Heimdall, an open-source, lightweight, and pluggable job execution platform developed at Pattern. Heimdall offers a secure, consistent way to manage jobs, abstracting away the complexities of the underlying data infrastructure. We will explore how our custom Airflow operators facilitate this integration, using the Heimdall API to route commands to specific clusters for execution on various data processing engines. Together, Heimdall and Airflow strengthen our data lake architecture by streamlining data pipelines and improving overall operational efficiency.
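As a rough illustration of the custom-operator pattern the talk describes, here is a hedged sketch of an operator that submits a command to a job-execution API and polls for completion. The operator name, endpoint paths, payload fields, and statuses below are invented for illustration and are not Heimdall's actual API:

```python
# Hypothetical sketch: submit a job to an HTTP job-execution API,
# then poll until it reaches a terminal state. Names and endpoints
# are illustrative, not Heimdall's real interface.
import time

import requests
from airflow.models import BaseOperator


class HeimdallSubmitJobOperator(BaseOperator):
    """Submit a command to a target cluster through a job-execution API."""

    def __init__(self, *, api_url: str, cluster: str, command: str, **kwargs):
        super().__init__(**kwargs)
        self.api_url = api_url
        self.cluster = cluster
        self.command = command

    def execute(self, context):
        # Submit the job; the API decides how and where it actually runs.
        resp = requests.post(
            f"{self.api_url}/jobs",
            json={"cluster": self.cluster, "command": self.command},
            timeout=30,
        )
        resp.raise_for_status()
        job_id = resp.json()["job_id"]

        # Poll until the job succeeds or fails.
        while True:
            status = requests.get(
                f"{self.api_url}/jobs/{job_id}", timeout=30
            ).json()["status"]
            if status == "SUCCEEDED":
                return job_id
            if status == "FAILED":
                raise RuntimeError(f"Job {job_id} failed")
            time.sleep(30)
```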
---------------------------------------------------------------------------------------
Presentation #3: Data-Aware Scheduling Using Assets (Previously Known as Datasets)
- Speaker: Vallabh Ghodke, Data Engineer at Pattern
In this presentation, we'll look into the basics of data-aware scheduling, going beyond conventional cron-based scheduling, and cover the topics below (a short illustrative sketch follows the list):
- Use cases for assets
- The @asset syntax for creating data-oriented pipelines
- Defining Airflow tasks as producers of assets
- Running DAGs on basic and advanced asset schedules
- Using asset aliases to create dynamic asset schedules
- Attaching information to, and retrieving information from, asset events
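For illustration, here is a minimal sketch using Airflow 3's asset APIs; the asset name and task bodies are placeholders:

```python
# A minimal sketch of data-aware scheduling with Airflow 3 assets.
# The asset name and task logic are illustrative placeholders.
from airflow.sdk import Asset, asset, dag, task


# @asset turns this function into a producer: each successful run
# emits an asset event for an asset named "orders".
@asset(schedule="@daily")
def orders():
    ...  # extract and write the orders data


# This DAG is scheduled on the asset rather than a cron expression:
# it runs whenever the "orders" asset receives a new event.
@dag(schedule=[Asset(name="orders")])
def consume_orders():
    @task
    def transform():
        ...  # read and transform the orders data

    transform()


consume_orders()
```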
---------------------------------------------------------------------------------------
AGENDA
- 11:00-11:30 AM: Arrivals, Networking, Food & Drinks
- 11:30 AM-1:00 PM: Presentations
- 1:00-2:00 PM: Networking, Food & Drinks
