
Slaying data quality issues with dbt core, Datafold's data-diff and Elementary

Hosted By
Lennart D. and Annieke H.

Details

Whether you are a data scientist wondering why you have duplicates in your dataset, or you’re an engineer dealing with missing values: we have all suffered from at least a few data quality battles. In our next Meetup, Chiel Fernhout (DevOps engineer at Datafold) and Sebas Higler (Data Engineer at BigData Republic) will arm you with open-source weapons to fight these data quality monsters!

In the first talk, Chiel will introduce you to Datafold’s data-diff. Data-diff checks every change to a data pipeline and highlights how the change in source code will affect the data produced by the pipeline. Chiel will show you how the tool works and how it can be integrated with dbt to support Test-Driven Data Development. In particular, he will highlight how you can use data-diff in your CI workflow to deal with data quality issues.

After that, Sebas will discuss several ways to monitor data using dbt Core. He will cover Elementary, an open-source data observability tool and third-party dbt package. It integrates with dbt and lets you run and inspect a variety of data quality checks.

As with all of our sessions, there will be food before the talks and drinks afterwards!

When
Thursday the 25th of May 2023

Where
BigData Republic Office, Coltbaan 4C, Nieuwegein

Register now
To help us prevent food waste, please fill in the registration form on our website to let us know whether you'll be joining us for dinner.
Registration form

Agenda

  • 17:45 Walk-in + food
  • 18:30 Datafold's data-diff by Chiel Fernhout
  • 19:00 Questions
  • 19:10 Elementary by Sebas Higler
  • 19:40 Questions + Discussion
  • 19:55 Drinks & snacks

About Chiel Fernhout
Chiel has seen many sides of the tech world. He started in the backend, moved to full stack Machine Learning (ML) Engineering and finally transitioned to DevOps. As a DevOps Engineer at Datafold he creates great testing automation tools for Data Engineers. In general, he tries to automate himself away ;)

About Sebas Higler
Sebas has a background in both Software Engineering and Artificial Intelligence. He enjoys working with and thinking about data problems. In his current role as Data Engineer at BigData Republic, he works on a project at IKEA that uses dbt to implement a new data mart.

We look forward to welcoming you to our next event! In case you have any questions, please contact Lennart (lennart.damen@bigdatarepublic.nl).

Data Engineering NL