

What we’re about
🌟 Welcome to the Pune Apache Airflow Enthusiasts Meetup! 🚀
Are you passionate about orchestrating data workflows, automating tasks, and streamlining data pipelines? Join us in the heart of Pune, India, where we gather to explore the exciting world of Apache Airflow.
🧠 Who Are We?
We are a community of data engineers, developers, data scientists, and cloud enthusiasts who share a common interest in Apache Airflow. Our goal is to create a vibrant platform for learning, networking, and collaboration.
🔍 What Do We Do?
Discover the latest trends and best practices in data orchestration.
Share real-world use cases and success stories.
Dive into hands-on workshops and tutorials.
Connect with like-minded professionals in the field.
Foster a supportive and inclusive environment for all skill levels.
🙋‍♀️ Who Should Join?
Whether you're a seasoned Apache Airflow pro or just getting started, this meetup is for you! Our diverse group welcomes beginners, experts, and everyone in between who is eager to expand their knowledge and skills in data orchestration.
💡 Why Join Us?
By joining the Pune Apache Airflow Enthusiasts, you'll gain valuable insights, make new connections, and stay at the forefront of this dynamic field. Plus, you'll be part of a vibrant community passionate about driving innovation in data automation.
Ready to embark on this journey with us? Join our meetup group today and be part of the exciting Apache Airflow discussions in Pune!
Upcoming events (1)
Pune Apache Airflow® Meetup at Pattern! (Pattern Technologies India Private Limited, Pune)
Join fellow Apache Airflow enthusiasts at Pattern for an afternoon of engaging presentations, great conversation, delicious food and beverages, and of course, swag!
The afternoon will consist of three presentations followed by networking. We can't wait to see you there!
PRESENTATIONS
Presentation #1: Orchestrating dbt-core Data Pipelines with Cosmos and Apache Airflow 3
- Speaker: Pankaj Koti, Open Source Engineer at Astronomer & Apache Airflow Committer
As organisations increasingly adopt dbt for data transformation and Apache Airflow for orchestration, there is a growing need for seamless integration between the two. Cosmos is an open-source framework designed to bridge this gap, enabling native, scalable, and testable dbt execution within Airflow DAGs. With the upcoming release of Apache Airflow 3, significant enhancements around performance, maintainability, and extensibility set the stage for an even tighter integration and better developer experience.
Cosmos simplifies the way data teams run dbt by translating dbt models, seeds, and snapshots into Airflow tasks, allowing fine-grained visibility, parallel execution, and native scheduling. It supports both local and remote dbt execution contexts, with compatibility for cloud-native environments like Kubernetes & Docker platforms. Key features include dependency-based task generation, cache management, remote storage support, and first-class integration with Airflow.
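To give a taste of what this looks like in practice, here is a minimal sketch of rendering a dbt project as an Airflow DAG with Cosmos. The project path, profile details, and connection id are placeholders, and exact arguments may differ across Cosmos and Airflow versions:

```python
# A minimal Cosmos sketch: render an existing dbt project as an Airflow DAG.
# The paths, profile name, and connection id below are placeholders.
from datetime import datetime

from cosmos import DbtDag, ProjectConfig, ProfileConfig
from cosmos.profiles import PostgresUserPasswordProfileMapping

profile_config = ProfileConfig(
    profile_name="analytics",        # placeholder dbt profile name
    target_name="dev",
    profile_mapping=PostgresUserPasswordProfileMapping(
        conn_id="postgres_default",  # Airflow connection to the warehouse
        profile_args={"schema": "public"},
    ),
)

# DbtDag walks the dbt project, creates one Airflow task per model/seed/snapshot,
# and wires task dependencies from the dbt graph.
dbt_dag = DbtDag(
    dag_id="jaffle_shop_dbt",
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
    profile_config=profile_config,
    schedule="@daily",
    start_date=datetime(2025, 1, 1),
    catchup=False,
)
```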
In this talk, we’ll explore:
- Why dbt + Airflow is such a powerful combo
- How Cosmos translates your dbt project into Airflow DAGs
- Cosmos DAGs running in Airflow 3
- Best practices, patterns, and lessons learned from real-world deployments
Whether you’re new to Airflow or already running dbt in production, you’ll walk away with practical ideas and tools to improve your data platform using Cosmos and Airflow 3.
---------------------------------------------------------------------------------------
Presentation #2: Accelerating the Data Lake Journey with Airflow and Heimdall
- Speaker: Sanket Jadhav, Principal Data Engineer at Pattern
Discover how we integrate Airflow with Heimdall, an open-source, lightweight, and pluggable job execution platform developed at Pattern. Heimdall offers a secure, consistent way to manage jobs, abstracting away the complexities of the underlying data infrastructure. We will explore how our custom Airflow operators facilitate this integration, using the Heimdall API to route commands to specific clusters for execution on various data processing engines. The synergy of Heimdall and Airflow strengthens our data lake architecture by optimizing data pipelines and improving overall operational efficiency.
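To give a flavour of the pattern, here is a hypothetical sketch of a custom operator that submits a job to an HTTP job-execution service. The endpoint, payload fields, and connection id are illustrative assumptions for this sketch, not Heimdall's actual API:

```python
# Hypothetical sketch: a custom operator that submits a command to an HTTP
# job-execution service (such as Heimdall). The endpoint path, payload shape,
# and connection id are illustrative placeholders, not the real Heimdall API.
import json

from airflow.models.baseoperator import BaseOperator
from airflow.providers.http.hooks.http import HttpHook


class SubmitJobOperator(BaseOperator):
    """Submit a command to a job-execution service and return the job id."""

    def __init__(self, *, command: str, cluster: str,
                 conn_id: str = "heimdall_default", **kwargs):
        super().__init__(**kwargs)
        self.command = command
        self.cluster = cluster
        self.conn_id = conn_id

    def execute(self, context):
        hook = HttpHook(method="POST", http_conn_id=self.conn_id)
        # Placeholder endpoint and payload for illustration only.
        response = hook.run(
            endpoint="/api/v1/jobs",
            data=json.dumps({"command": self.command, "cluster": self.cluster}),
            headers={"Content-Type": "application/json"},
        )
        job_id = response.json().get("jobId")
        self.log.info("Submitted job %s to cluster %s", job_id, self.cluster)
        return job_id
```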
---------------------------------------------------------------------------------------
Presentation #3: Data-aware scheduling using Assets (previously known as Datasets)
- Speaker: Vallabh Ghodke, Data Engineer at Pattern
In this presentation, we'll cover the basics of data-aware scheduling, which goes beyond conventional cron-based scheduling, including the following topics (a minimal sketch follows the list):
- Use cases for assets
- Using the @asset syntax to create data-oriented pipelines
- Defining Airflow tasks as producers of assets
- Running DAGs based on basic and advanced asset schedules
- Using asset aliases to create dynamic asset schedules
- Attaching information to, and retrieving information from, asset events
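Below is a minimal sketch of the producer/consumer pattern behind asset-based scheduling, assuming Airflow 3 (where Datasets were renamed to Assets). The asset URI, DAG ids, and import paths are illustrative and will differ on Airflow 2.x:

```python
# A minimal data-aware scheduling sketch, assuming Airflow 3.
# The asset URI and DAG ids are placeholders; on Airflow 2.x the
# equivalent class is airflow.datasets.Dataset.
from datetime import datetime

from airflow.sdk import DAG, Asset, task

orders = Asset("s3://example-bucket/orders.parquet")  # placeholder URI

# Producer DAG: the task declares the asset as an outlet, so a successful
# run emits an asset event.
with DAG(dag_id="produce_orders", start_date=datetime(2025, 1, 1), schedule="@daily"):

    @task(outlets=[orders])
    def extract_orders():
        ...  # write orders.parquet

    extract_orders()

# Consumer DAG: instead of a cron expression, it is scheduled on the asset
# and runs whenever a new asset event for `orders` is recorded.
with DAG(dag_id="transform_orders", start_date=datetime(2025, 1, 1), schedule=[orders]):

    @task
    def transform():
        ...  # read and transform orders.parquet

    transform()

# The @asset decorator (also covered in the talk) bundles an asset and its
# producing task into a single definition, e.g. @asset(schedule="@daily");
# exact parameters may vary between Airflow 3 minor versions.
```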
---------------------------------------------------------------------------------------
AGENDA
- 11:00-11:30 AM: Arrivals, Networking, Enjoy Food & Drinks
- 11:30 AM-1:00 PM: Presentations
- 1:00-2:00 PM: Networking & Enjoy Food & Drinks