

About us
A knowledge-sharing and networking group for data engineers in the Greater Toronto Area. Twice a month, we host expert-led webinar sessions covering the skills and knowledge that data engineers, AI engineers, and BI engineers should have.
We have a Slack social space: https://dataengineersto.slack.com/join/shared_invite/zt-3tdhj245t-vcU5zF1CTDPaCFp8nfF8Sw#/shared-invite/email
Fabric Community page with occasional benefits: https://community.fabric.microsoft.com/t5/Data-Engineers-In-Toronto/gh-p/DataEngineersInToronto
LinkedIn Page: https://www.linkedin.com/company/data-engineers-in-toronto/
LinkedIn Group: https://www.linkedin.com/groups/18627021/
Most meetings are held virtually on Microsoft Teams. If you are interested in presenting, submit your session details on Sessionize: https://sessionize.com/data-engineers-in-toronto
Upcoming events
8

Maximize SQL Server Performance with Read Committed Snapshot Isolation
Online · Data Engineers in Toronto May 2026 Semimonthly Meeting
Topic: Maximize SQL Server Performance with Read Committed Snapshot Isolation
Abstract:
Are your read operations frequently blocked by write operations? Do you want to retrieve data without relying on NOLOCK? Then it's time to switch to Read Committed Snapshot Isolation (RCSI).
In this session, I’ll explain what RCSI is, how it works, and how it differs from the default Read Committed isolation level. I’ll demonstrate how the version store comes into play with this isolation level and when it can overwhelm TempDB if not managed properly. Finally, I’ll discuss how to manage the implications of enabling RCSI.
By the end of this session, you'll have a clear understanding of RCSI and when to implement it in your environment for improved concurrency and performance.
Speaker: Haripriya Naidu, Lead SQL Server Database Administrator
Speaker Profile:
Haripriya Naidu has been working as a SQL Server Database Administrator for 11 years in Boston, MA. She is passionate about databases and enjoys diving into SQL Server engine internals and optimizing performance. She is an AWS Certified Solutions Associate and won the "Rookie of the Year" award at work in 2024 for outstanding performance and the "Bookworm of the Year" award at work in 2025 for continuously learning and sharing knowledge with the team.
She recently started her speaking journey with New Stars of Data and has spoken at SQL Saturdays in Boston, Albany, and San Diego, and at PASS Summit 2024.
The meeting is over Microsoft Teams, and the joining link is https://teams.microsoft.com/l/meetup-join/19%3ameeting_NzZkYWIyOTAtODk1MC00MjVmLWJlNjUtNTRiODZmODA2Zjdh%40thread.v2/0?context=%7b%22Tid%22%3a%22bd9727e8-f539-4c76-983c-6c30130c0bee%22%2c%22Oid%22%3a%229e8d5a64-e773-4ca2-90f6-9a266129171e%22%7d
See you at the meeting!
11 attendees
Data Modernization at Traditional Banks - From Legacy Mainframe to Cloud
Online · Data Engineers in Toronto May 2026 Semimonthly Meeting
Topic: Data Modernization at Traditional Banks - From Legacy Mainframe to Cloud
Abstract:
As technology evolved from the age of industrialization to the age of information, and finally to the current age of AI, the average technology lifecycle shrank from 10 years to 6, and is now 3. As a result, the definition of “legacy” has shifted: applications developed before 2010, especially those built on mainframes or outdated technologies, are increasingly seen as candidates for modernization.
Mainframe transformation has been a top priority in financial services because of the operational inefficiencies and high maintenance costs of mainframes compared with modern distributed systems. Still, the majority of the core processing engines within financial services depend on mainframes, as the mission-critical business rules are embedded
Speaker: Vishal Sharma, VP - Software Engineering at Broadridge
Speaker Profile:
As a Software Engineering leader, I bring over 20 years of expertise in enterprise architecture and digital transformation. By leading cross-functional teams, I enable the delivery of secure, scalable solutions that optimize operations and enhance client engagement. My efforts focus on integrating innovative technologies to align with strategic business goals, resulting in measurable outcomes for stakeholders.
With a strong foundation in business architecture and fintech innovation, I excel in modernizing platforms to meet evolving industry demands. Equipped with certifications such as TOGAF, I prioritize seamless system integration, cost efficiency, and regulatory alignment to drive long-term value for organizations. Passionate about solving complex challenges, I work collaboratively to empower teams and achieve sustainable success.
The meeting is over Microsoft Teams, and the joining link is https://teams.microsoft.com/l/meetup-join/19%3ameeting_NzZkYWIyOTAtODk1MC00MjVmLWJlNjUtNTRiODZmODA2Zjdh%40thread.v2/0?context=%7b%22Tid%22%3a%22bd9727e8-f539-4c76-983c-6c30130c0bee%22%2c%22Oid%22%3a%229e8d5a64-e773-4ca2-90f6-9a266129171e%22%7d
See you at the meeting!
12 attendees
Building Scalable SCD Type 2 Pipelines in MS Fabric DW Using T-SQL
Online · Data Engineers in Toronto June 2026 Semimonthly Meeting
Topic: Building Scalable SCD Type 2 Pipelines in MS Fabric DW Using T-SQL
Abstract:
Implementing Slowly Changing Dimension (Type 2) at scale is critical for maintaining historical accuracy in analytics, but doing so efficiently across billions of rows in Microsoft Fabric Data Warehouse requires leveraging its modern ingestion and optimization capabilities.
In this session, we’ll build a Fabric-optimized SCD Type 2 pipeline using pure T-SQL patterns. We’ll start by comparing two ingestion strategies, OPENROWSET for schema-on-read exploration and COPY INTO for high-throughput, parallelized loading, and explain why COPY INTO is the preferred method for large-scale ingestion in Fabric Warehouses.
Next, we’ll implement incremental load logic without MERGE (since Fabric does not currently support the MERGE statement) by using UPDATE + INSERT patterns combined with hash-based change detection and filtered indexes for current-row lookups. We’ll also cover performance accelerators like batching, minimal logging, and distribution strategies to maximize query performance.
Finally, we’ll demonstrate a full end-to-end pipeline:
- Discover external Parquet/CSV files with OPENROWSET
- Ingest into Fabric Warehouse using COPY INTO
- Apply SCD Type 2 logic using merge-like T-SQL patterns for historical tracking
You’ll leave with a production-ready template and a Fabric-specific performance playbook for handling incremental loads at scale with minimal friction.
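As a rough illustration of the UPDATE + INSERT pattern with hash-based change detection described in the abstract, here is a minimal Python sketch. It is not the speaker's template: it uses an in-memory SQLite database as a stand-in for a Fabric Warehouse, and the table names (stg_customer, dim_customer), column names, and helper functions are all hypothetical.

```python
import hashlib
import sqlite3

# Hypothetical attributes whose changes should create a new dimension version.
TRACKED = ("name", "city")


def row_hash(row):
    """Hash the tracked attributes so change detection is one comparison."""
    # Assumes values never contain the "|" separator.
    payload = "|".join(str(row[c]) for c in TRACKED)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_db():
    """Create hypothetical staging and dimension tables (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stg_customer "
        "(customer_id INTEGER, name TEXT, city TEXT, row_hash TEXT)"
    )
    conn.execute(
        "CREATE TABLE dim_customer "
        "(customer_id INTEGER, name TEXT, city TEXT, row_hash TEXT, "
        "valid_from TEXT, valid_to TEXT, is_current INTEGER)"
    )
    return conn


def apply_scd2(conn, batch, load_date):
    """Apply one incremental batch using UPDATE + INSERT instead of MERGE."""
    # Reload the staging table with the incoming batch and its hashes.
    conn.execute("DELETE FROM stg_customer")
    conn.executemany(
        "INSERT INTO stg_customer VALUES (?, ?, ?, ?)",
        [(r["customer_id"], r["name"], r["city"], row_hash(r)) for r in batch],
    )
    # Step 1 (UPDATE): close current rows whose hash differs from staging.
    conn.execute(
        """
        UPDATE dim_customer
           SET valid_to = ?, is_current = 0
         WHERE is_current = 1
           AND EXISTS (SELECT 1 FROM stg_customer s
                        WHERE s.customer_id = dim_customer.customer_id
                          AND s.row_hash <> dim_customer.row_hash)
        """,
        (load_date,),
    )
    # Step 2 (INSERT): add a current row for every staged key that has none,
    # which covers brand-new keys and keys closed in step 1.
    conn.execute(
        """
        INSERT INTO dim_customer
            (customer_id, name, city, row_hash, valid_from, valid_to, is_current)
        SELECT s.customer_id, s.name, s.city, s.row_hash, ?, NULL, 1
          FROM stg_customer s
         WHERE NOT EXISTS (SELECT 1 FROM dim_customer d
                            WHERE d.customer_id = s.customer_id
                              AND d.is_current = 1)
        """,
        (load_date,),
    )
    conn.commit()
```

Calling apply_scd2 twice with a changed city for one customer leaves the old row closed (valid_to set, is_current = 0) and adds a new current row, while unchanged customers are not touched. The same two statements would need adapting to T-SQL syntax and to whatever ingestion step populates the staging table.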
Speaker: Jean Joseph, Principal Data & AI Engineer @Tech-Insight-Group LLC
Speaker Profile:
Jean Joseph is a seasoned consultant and senior technical trainer specializing in data engineering and artificial intelligence. With a strong background in database design, administration, and cutting-edge data technologies including machine learning and generative AI, he helps organizations build secure, scalable solutions across both legacy systems and modern cloud platforms. Formerly recognized as a Microsoft MVP and senior technical trainer at Microsoft, Jean brings deep technical insight and a passion for teaching.
He’s also a dynamic speaker, mentor, and the founder of the Cloud Data Driven User Group and the Future Data Driven Summit, where he champions innovation and promotes responsible use of emerging tech within the data community.
The meeting is over Microsoft Teams, and the joining link is https://teams.microsoft.com/l/meetup-join/19%3ameeting_NzZkYWIyOTAtODk1MC00MjVmLWJlNjUtNTRiODZmODA2Zjdh%40thread.v2/0?context=%7b%22Tid%22%3a%22bd9727e8-f539-4c76-983c-6c30130c0bee%22%2c%22Oid%22%3a%229e8d5a64-e773-4ca2-90f6-9a266129171e%22%7d
See you at the meeting!
4 attendees
Data Mesh as the Foundation for AI/ML in Financial Services
Online · Data Engineers in Toronto June 2026 Semimonthly Meeting
Topic: Data Mesh as the Foundation for AI/ML in Financial Services
Abstract:
Financial institutions want AI/ML at scale, but brittle data pipelines, silos, and compliance demands slow progress. This talk shows how a Data Mesh (domain-oriented ownership, data-as-product, self-serve platforms, and federated governance) becomes the foundation for reliable, reusable ML features and trustworthy models. We’ll map mesh principles to financial services use cases such as fraud detection, risk, and personalization, and show patterns for feature stores, lineage, quality, and access controls that satisfy regulators while accelerating delivery. Attendees will get a pragmatic blueprint: where to start, how to sequence capabilities, metrics that prove value, and pitfalls to avoid on the road from pilots to production.
Speaker: Santosh Durgam, Data Engineering & Analytics Leader
Speaker Profile:
Santosh Durgam is a data engineering & analytics leader with 20+ years building governed, high-scale data platforms across retirement/401(k), broader financial services, and healthcare. He leads cross-functional teams that deliver production-grade data lakes, lineage-aware pipelines, and ML-enabled analytics on cloud, translating governance into measurable business outcomes.
Recent speaking includes SQL Saturday Minnesota 2025, where he presented “From Ingestion to Insights: Building Robust Data Pipelines in AWS” to an in-person community audience. He has also contributed to international research forums and science conferences, and is invited to speak at ICDPN-2025 (International Conference on Data Processing & Networking), engaging practitioners and scholars on data engineering, governance, and analytics at scale. Santosh actively publishes/curates work via Google Scholar and shares practical playbooks for data quality, metadata/lineage, and operating models that connect data platforms to financial decisioning.
Beyond delivery, Santosh serves the community as a peer reviewer of scholarly work on data/ML methodologies and as a judge/mentor for select industry and academic competitions, reinforcing peer validation and public recognition. He champions modern data culture by mentoring engineers and product leaders and advocating automation (incl. AI agents) to elevate reliability, speed, and auditability in regulated environments. Santosh recently completed his Executive MBA, sharpening strategy and value-creation at the intersection of data, risk, and growth.
The meeting is over Microsoft Teams, and the joining link is https://teams.microsoft.com/l/meetup-join/19%3ameeting_NzZkYWIyOTAtODk1MC00MjVmLWJlNjUtNTRiODZmODA2Zjdh%40thread.v2/0?context=%7b%22Tid%22%3a%22bd9727e8-f539-4c76-983c-6c30130c0bee%22%2c%22Oid%22%3a%229e8d5a64-e773-4ca2-90f6-9a266129171e%22%7d
See you at the meeting!
2 attendees
Past events
23

