The bottleneck to discovery is no longer a lack of data but an inability to manage, analyze, and share ever larger datasets. Individual researchers can no longer download and analyze the important datasets in their scientific fields on their own computers. Cross-disciplinary analysis is even more difficult.
Long-term active data preservation and reproducible data analysis are both important to the scientific community. A computational infrastructure, or "data commons," is needed to deal with big data. In this talk, Maria Patterson will share her experience building a "data commons" for scientists and researchers.
Maria Patterson earned a PhD in Astronomy. Frustrated by the difficulty of working with large astronomical datasets, she joined the University of Chicago's Center for Data Intensive Science, where she is Scientific Lead for the Open Science Data Cloud, an interdisciplinary community science cloud supporting researchers with data-intensive projects.
Maria also works with the Open Commons Consortium (OCC) on Project Matsu, a collaboration with NASA to make satellite data and automated analysis pipeline products available via the cloud. In addition, she contributes to the OCC's work on the NOAA Big Data Project, which aims to increase usage of NOAA's vast environmental data archives and make them more broadly available in the cloud.