June 14, 2012 · 7:00 PM
This special meeting, held jointly with the Chicago Python User Group (http://chipy.org/), focuses on Open Data and on current projects that rely heavily on Python to process it.
Here's the agenda:
Python Open Data Summit Intro
Open Government Data Movement Overview
This talk covers the history and goals of the open government data movement nationally and in Chicago, earlier commercial uses of open data such as weather data, and a couple of contemporary examples of how cities and independent groups are using open data.
Big Data De-Duping
Forest Gregg, Derek Eder
Derek Eder of Webitects and Forest Gregg, a Ph.D. student in sociology at the University of Chicago, will describe the Python library they are developing to deduplicate tabular data quickly, accurately, and at scale. The library uses a machine learning approach to match related records across different data sets. They expect to show a demo and will explain how they anticipate the library being used.
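For readers new to the problem, here is a rough sketch of what matching related records can look like in plain Python. It is not the speakers' library and it does not use machine learning. It only compares every pair of rows with the standard library's difflib and flags pairs whose average field similarity passes a fixed cutoff. The sample records, the field names, the function names, and the 0.8 threshold are all made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample rows, invented for this sketch.
records = [
    {"id": 1, "name": "Jon Smith", "address": "123 Main St"},
    {"id": 2, "name": "John Smith", "address": "123 Main Street"},
    {"id": 3, "name": "Alice Jones", "address": "9 Oak Ave"},
]


def similarity(a, b):
    """Return a 0..1 similarity ratio for two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_score(r1, r2, fields=("name", "address")):
    """Average the per-field similarity of two records."""
    return sum(similarity(r1[f], r2[f]) for f in fields) / len(fields)


def find_duplicates(rows, threshold=0.8):
    """Return id pairs whose average similarity meets the threshold.

    The threshold is an arbitrary assumption; a learned model would
    replace this hand-picked cutoff.
    """
    return [
        (a["id"], b["id"])
        for a, b in combinations(rows, 2)
        if record_score(a, b) >= threshold
    ]


if __name__ == "__main__":
    print(find_duplicates(records))
```

Comparing every pair of rows grows quadratically with the number of records, so a sketch like this would not scale to large data sets as written.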