How do you test data workflows?


Details
N.B.: Event will be on the 6th floor.
Ever deployed new code for a 10-hour data pipeline, only to have it break in the middle because you forgot a closing parenthesis somewhere? I have, and it hurts. Other software disciplines write tests for their code, so why can't we?
But wait, how do you even unit test a SQL query?
In this talk, Paul will introduce ways to approach testing your data code, looking at traditional tests, static analysis, and data quality checks. We'll discuss useful tools like pre-commit, SQLFluff, and Great Expectations. And with just a little bit of work, you too can make your data code much more robust.
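For a flavor of what a "traditional test" for a SQL query can look like, here is a minimal sketch. It is not taken from the talk: the `orders` table, the query, and the function names are all hypothetical. It assumes the query is portable enough to run against an in-memory SQLite database from Python's standard library.

```python
import sqlite3

# Hypothetical query under test: total paid amount per customer.
REVENUE_SQL = """
SELECT customer_id, SUM(amount) AS total
FROM orders
WHERE status = 'paid'
GROUP BY customer_id
ORDER BY customer_id
"""


def run_query(rows):
    """Load fixture rows into a throwaway in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE orders (customer_id INTEGER, amount REAL, status TEXT)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        return conn.execute(REVENUE_SQL).fetchall()
    finally:
        conn.close()


def test_only_paid_orders_are_summed():
    # Small hand-written fixture: one refunded order must be excluded.
    rows = [
        (1, 10.0, "paid"),
        (1, 5.0, "paid"),
        (1, 99.0, "refunded"),
        (2, 7.5, "paid"),
    ]
    assert run_query(rows) == [(1, 15.0), (2, 7.5)]
```

A test runner such as pytest would pick up `test_only_paid_orders_are_summed` automatically. The same idea carries over to a real warehouse: a few hand-written rows go in, and the expected result is asserted on the way out.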
About the speaker
Paul Anzel is a senior data engineer at Sentry.io, where he uses dbt daily to clean and prepare customer and business data for BizOps, sales, and marketing. He has previously worked for HEB, Metromile, and Wiser. Paul is a volunteer for the NumFOCUS Affiliate Project Selection Committee and the SciPy conference.