Building Scalable SCD Type 2 Pipelines in MS Fabric DW Using T-SQL
Data Engineers in Toronto June 2026 Semimonthly Meeting
Topic: Building Scalable SCD Type 2 Pipelines in MS Fabric DW Using T-SQL
Abstract:
Implementing Slowly Changing Dimension (Type 2) at scale is critical for maintaining historical accuracy in analytics, but doing so efficiently across billions of rows in Microsoft Fabric Data Warehouse requires leveraging its modern ingestion and optimization capabilities.
In this session, we’ll build a Fabric-optimized SCD Type 2 pipeline using pure T-SQL patterns. We’ll start by comparing two ingestion strategies, OPENROWSET for schema-on-read exploration and COPY INTO for high-throughput, parallelized loading, and explain why COPY INTO is the preferred method for large-scale ingestion in Fabric Warehouses.
Next, we’ll implement incremental load logic without MERGE (since Fabric does not currently support the MERGE statement) by using UPDATE + INSERT patterns combined with hash-based change detection and filtered indexes for current-row lookups. We’ll also cover performance accelerators like batching, minimal logging, and distribution strategies to maximize query performance.
Finally, we’ll demonstrate a full end-to-end pipeline:
- Discover external Parquet/CSV files with OPENROWSET
- Ingest into Fabric Warehouse using COPY INTO
- Apply SCD Type 2 logic using merge-like T-SQL patterns for historical tracking
You’ll leave with a production-ready template and a Fabric-specific performance playbook for handling incremental loads at scale with minimal friction.
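For readers who want a feel for the three pipeline steps before the session, here is a minimal T-SQL sketch. It is not the speaker's template. The storage URL placeholders, the table names (dbo.stg_customer, dbo.dim_customer), and every column name are hypothetical, chosen only for illustration. It assumes the files are readable without an explicit credential and that staging holds at most one row per customer_id. It has not been validated against a live Fabric Warehouse, so treat it as a starting point.

```sql
-- Hypothetical staging and dimension tables (names and columns are assumptions).
CREATE TABLE dbo.stg_customer (
    customer_id INT NOT NULL,
    name        VARCHAR(100),
    city        VARCHAR(100)
);

CREATE TABLE dbo.dim_customer (
    customer_id INT NOT NULL,
    name        VARCHAR(100),
    city        VARCHAR(100),
    row_hash    VARBINARY(32),   -- SHA2_256 of the tracked attributes
    valid_from  DATETIME2(6) NOT NULL,
    valid_to    DATETIME2(6),    -- NULL while the row is current
    is_current  BIT NOT NULL
);

-- Step 1: explore the external files (schema-on-read).
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<storage-account>.blob.core.windows.net/<container>/customer/*.parquet'
) AS src;

-- Step 2: bulk-load the same files into the staging table.
COPY INTO dbo.stg_customer
FROM 'https://<storage-account>.blob.core.windows.net/<container>/customer/*.parquet'
WITH (FILE_TYPE = 'PARQUET');

-- Step 3: SCD Type 2 as UPDATE + INSERT with hash-based change detection.
DECLARE @load_ts DATETIME2(6) = SYSUTCDATETIME();

BEGIN TRANSACTION;

-- 3a: close the current row for every key whose attributes changed.
UPDATE dbo.dim_customer
SET valid_to   = @load_ts,
    is_current = 0
FROM dbo.stg_customer AS s
WHERE dbo.dim_customer.customer_id = s.customer_id
  AND dbo.dim_customer.is_current = 1
  AND dbo.dim_customer.row_hash
      <> HASHBYTES('SHA2_256', CONCAT(s.name, '|', s.city));

-- 3b: insert a new current row for brand-new keys and for the keys
--     closed in 3a (neither has a current row at this point).
INSERT INTO dbo.dim_customer
    (customer_id, name, city, row_hash, valid_from, valid_to, is_current)
SELECT s.customer_id,
       s.name,
       s.city,
       HASHBYTES('SHA2_256', CONCAT(s.name, '|', s.city)),
       @load_ts,
       NULL,
       1
FROM dbo.stg_customer AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.dim_customer AS d
    WHERE d.customer_id = s.customer_id
      AND d.is_current = 1
);

COMMIT TRANSACTION;
```

Running the UPDATE before the INSERT is what makes the single NOT EXISTS check sufficient: once changed rows are closed, "no current row" covers both new and changed keys. Unchanged keys still have a current row with a matching hash, so they are skipped.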
Speaker: Jean Joseph, Principal Data & AI Engineer @Tech-Insight-Group LLC
Speaker Profile:
Jean Joseph is a seasoned consultant and senior technical trainer specializing in data engineering and artificial intelligence, with a strong background in database design, administration, and cutting-edge data technologies including machine learning and generative AI.
He helps organizations build secure, scalable solutions across both legacy systems and modern cloud platforms. Formerly recognized as a Microsoft MVP and senior technical trainer at Microsoft, Jean brings deep technical insight and a passion for teaching.
He’s also a dynamic speaker, mentor, and the founder of the Cloud Data Driven User Group and the Future Data Driven Summit, where he champions innovation and promotes responsible use of emerging tech within the data community.
The meeting will be held on Microsoft Teams. The joining link is https://teams.microsoft.com/l/meetup-join/19%3ameeting_NzZkYWIyOTAtODk1MC00MjVmLWJlNjUtNTRiODZmODA2Zjdh%40thread.v2/0?context=%7b%22Tid%22%3a%22bd9727e8-f539-4c76-983c-6c30130c0bee%22%2c%22Oid%22%3a%229e8d5a64-e773-4ca2-90f6-9a266129171e%22%7d
See you at the meeting!
