Test-Driven Data Engineering Fundamentals Using DBT and Python


Details
NOTE: This will be a hybrid online and in-person event. We are still working through the logistics given the ever-changing COVID situation. More details to come closer to the date.
Agenda:
6:00 doors open for networking
6:30 food + networking
7:00 welcome + announcements
7:10 main talk
Abstract:
In data science and engineering, automated testing is far from commonplace. One way to commit to automated testing is to follow test-driven development (TDD). Test-driven data engineering (TDDE) is a special case because automated tests are often very complex to build, whether due to a lack of frameworks, limited access to tool internals, or even the absence of local environments. In addition, new tools and technologies are frequently introduced that abstract away the ability to develop automated tests for code that touches data. With data being one of an organization’s most critical assets, doesn’t it make sense to invest in the skillsets needed to achieve quantifiable quality for that data?
In this presentation and demonstration, you will learn some of the fundamental patterns for building automated tests as part of the data engineering process, tests that would run as part of a CI/CD pipeline. DBT (https://www.getdbt.com/) and Python will be used to demonstrate different types of data engineering use cases that require creative thinking to achieve TDDE. Topics covered will start with the use of DBT and Python with Docker and SQL Server. Strategies around Spark, Snowflake, and orchestration, and ways to apply the concepts to other technologies, may also be discussed or demonstrated.
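For readers new to the idea, here is a minimal sketch of what a test-first data transformation might look like in plain Python. It is not taken from the talk: the table `raw_customers`, the query `LATEST_CUSTOMER_SQL`, and the helper functions are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for a real warehouse such as SQL Server. The tests describe the expected output for a small, hand-built input, and the SQL is then written to satisfy them.

```python
import sqlite3

# Hypothetical transformation under test: keep only the most recent row
# per customer. In a TDD workflow the tests below would be written first.
LATEST_CUSTOMER_SQL = """
SELECT customer_id, email
FROM raw_customers AS r
WHERE updated_at = (
    SELECT MAX(updated_at)
    FROM raw_customers
    WHERE customer_id = r.customer_id
)
ORDER BY customer_id
"""


def make_fixture(rows):
    """Build an in-memory database holding a small, known input data set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_customers "
        "(customer_id INTEGER, email TEXT, updated_at TEXT)"
    )
    conn.executemany("INSERT INTO raw_customers VALUES (?, ?, ?)", rows)
    return conn


def run_transformation(conn):
    """Run the transformation and return its rows as a list of tuples."""
    return conn.execute(LATEST_CUSTOMER_SQL).fetchall()


def test_keeps_latest_row_per_customer():
    conn = make_fixture([
        (1, "old@example.com", "2021-01-01"),
        (1, "new@example.com", "2021-06-01"),
        (2, "only@example.com", "2021-03-01"),
    ])
    assert run_transformation(conn) == [
        (1, "new@example.com"),
        (2, "only@example.com"),
    ]


def test_empty_input_yields_empty_output():
    assert run_transformation(make_fixture([])) == []
```

A test runner such as pytest can discover and run the `test_` functions as written, which is one way tests like these could be wired into a CI/CD pipeline.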
Donald Sawyer is the Director of Data Engineering at Object Partners. He has many years of experience applying software engineering skills such as testing, Scrum, UX, and architecture design to data science and engineering. He also built the course "Big Data Engineering and Architecture" and has taught it at the University of Minnesota and St. Cloud State University for the past five years. He has built numerous TDDE frameworks for clients and has given many talks on the quality side of data engineering.
