Measuring, Maintaining and Improving Data Quality

Details
What You'll Learn
Data quality is a fundamental issue for data science, and it can be measured in many different ways. Whether the data is stored in a relational database, a hierarchical file, or key/value pairs, the reliability of your analysis depends on the quality of the data involved. In this talk, Joe Sremack will discuss what data quality is, the categories in which it can be measured, and common data quality issues. Sample data and Python code will be shown to illustrate what these issues look like and how they can be identified.
Our Speaker
Joe Sremack is a Director in Berkeley Research Group’s Technology Services Group. He has spent over a decade performing digital investigations for some of the largest civil and criminal cases in US history, such as the Bernie Madoff and Allen Stanford Ponzi scheme investigations. His work involves the collection and analysis of organizations’ data, where data quality is a frequent issue.
Agenda
6:30 - Food and drinks
7:00 - Intro
7:10 - Talk
8:30 - Drinks at Tonic
