ETL Strategies in Microsoft Fabric - Pipelines, Dataflows & Notebooks
Details
🎯 The Problem: Your ETL jobs take 6 hours. Business users want data "as fresh as possible." You're juggling Dataflows, Notebooks, Pipelines, and now Mirroring - but which one is actually right for which scenario?
💡 This Evening: We'll cut through the confusion with a clear decision framework. You'll see each tool in action and understand when to use Spark vs. Dataflows vs. Mirroring - with real-world examples.
⏱️ Save yourself: Weeks of painful refactoring. Choose the right ETL approach from day one.
What you'll learn:
⚙️ Spark Environments - Managing libraries and configurations, and WHY environment isolation matters for production
🚀 Spark Job Definitions - Automating Spark jobs at scale, and WHY scheduled jobs beat manual notebook runs
🔄 Data Factory Pipelines - Enterprise orchestration with activities, and WHY Fabric Pipelines are simpler than classic ADF
🪞 Database Mirroring - Near real-time replication from SQL Server, Cosmos DB, Snowflake, and WHY mirroring beats traditional CDC approaches
Decision Framework:
- Simple transformations → Dataflow Gen2
- Complex logic or ML → Notebook, promoted to a Spark Job Definition
- Orchestration, dependencies → Data Pipeline
- Real-time sync from source → Mirroring
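The framework above boils down to a lookup from scenario to workload. As a minimal sketch (function and scenario names are illustrative, not part of any Fabric API):

```python
def recommend_tool(scenario: str) -> str:
    """Map an ETL scenario to the Fabric workload suggested by the framework.

    Scenario keys are hypothetical labels for the four cases in the list above.
    """
    rules = {
        "simple_transformations": "Dataflow Gen2",
        "complex_logic_or_ml": "Notebook / Spark Job Definition",
        "orchestration_dependencies": "Data Pipeline",
        "real_time_sync": "Mirroring",
    }
    # Fall back to an explicit "revisit" answer rather than guessing a tool.
    return rules.get(scenario, "unknown - revisit requirements")

print(recommend_tool("real_time_sync"))   # Mirroring
print(recommend_tool("batch_ml_scoring"))  # unknown - revisit requirements
```

The point of the table form: most "which tool?" debates collapse once the scenario is named precisely.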
Who should attend: Data Engineers building ETL pipelines, DBAs managing data integration, Architects designing data platforms
Agenda:
- 18:30 - Welcome & Networking
- 18:45 - Environments & Spark Jobs
- 19:10 - Data Factory Pipelines Deep Dive
- 19:35 - Database Mirroring - The Game Changer
- 19:50 - ETL Decision Framework
- 19:55 - Q&A and Discussion
- 20:00 - Networking
