🎯 The Problem: Your ETL jobs take 6 hours. Business users want data "as fresh as possible." You're juggling Dataflows, Notebooks, Pipelines, and now Mirroring - but which one is actually right for which scenario?
💡 This Evening: We'll cut through the confusion with a clear decision framework. You'll see each tool in action and understand when to use Spark vs. Dataflows vs. Mirroring - with real-world examples.
⏱️ Save yourself: Weeks of painful refactoring. Choose the right ETL approach from day one.
What you'll learn:
⚙️ Spark Environments - Managing libraries and configurations, and WHY environment isolation matters for production
🚀 Spark Job Definitions - Automating Spark jobs at scale, and WHY scheduled jobs beat manual notebook runs (see the sketch after this list)
🔄 Data Factory Pipelines - Enterprise orchestration with activities, and WHY Fabric Pipelines are simpler than classic ADF
🪞 Database Mirroring - Near real-time replication from SQL Server, Cosmos DB, Snowflake, and WHY mirroring beats traditional CDC approaches
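
To make the Spark Job Definition point concrete, here is a minimal sketch of the kind of entry script such a job runs on a schedule. The paths, column names, and table name (Files/landing/orders/, orders_clean) are hypothetical placeholders, not the session's demo code:

```python
# Minimal Spark Job Definition entry script (sketch).
# Paths, columns, and table names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("nightly_orders_etl").getOrCreate()

# Read raw CSV files landed in the Lakehouse Files area.
raw = (
    spark.read
    .option("header", "true")
    .csv("Files/landing/orders/")
)

# Light cleanup: typed columns, deduplication, a load timestamp.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .dropDuplicates(["order_id"])
       .withColumn("loaded_at", F.current_timestamp())
)

# Write to a managed Delta table; the schedule, not a human, reruns this.
clean.write.mode("overwrite").format("delta").saveAsTable("orders_clean")
```

Pasted into a notebook, this runs interactively; wrapped in a Spark Job Definition with a schedule, it becomes the unattended production job the session argues for.
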
Decision Framework:

  • Simple transformations → Dataflow Gen2
  • Complex logic or ML → Notebook, promoted to a Spark Job
  • Orchestration, dependencies → Data Pipeline
  • Real-time sync from source → Mirroring (see the sketch below)
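
To preview the mirroring branch of that framework: once a source database is mirrored into OneLake, its tables read like ordinary Delta tables. A minimal sketch, assuming a Lakehouse table shortcut named orders pointing at the mirrored database (the shortcut, column, and table names are hypothetical):

```python
# Querying mirrored data from Spark (sketch).
# Assumes a Lakehouse table shortcut named "orders" that points at the
# mirrored database's orders table; all names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("read_mirrored").getOrCreate()

# The mirrored table reads like any Delta table: no CDC jobs to run,
# no extract pipeline to schedule; Fabric keeps it near real-time.
fresh_totals = spark.sql("""
    SELECT customer_id, SUM(amount) AS total_amount
    FROM orders
    GROUP BY customer_id
""")

fresh_totals.show(10)
```

No CDC agent and no extract job to babysit: Fabric maintains the replica, which is why the agenda calls mirroring "The Game Changer."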

Who should attend: Data Engineers building ETL pipelines, DBAs managing data integration, and Architects designing data platforms
Agenda:

  • 18:30 - Welcome & Networking
  • 18:45 - Environments & Spark Jobs
  • 19:10 - Data Factory Pipelines Deep Dive
  • 19:35 - Database Mirroring - The Game Changer
  • 19:50 - ETL Decision Framework
  • 19:55 - Q&A and Discussion
  • 20:00 - Networking
