Chris Gorgolewski is with the Dataset Search team at Google.
Title: Google Dataset Search: Facilitating data discovery in an open ecosystem
Abstract: There are thousands of data repositories on the Web, providing access to millions of datasets. National and regional governments, scientific publishers and consortia, commercial data providers, and others publish data in fields ranging from social science to life science to high-energy physics to climate science and more. Access to this data is critical to facilitating reproducibility of research results, enabling scientists to build on others’ work, and giving data journalists easier access to information and its provenance. In this talk, I will discuss the recently launched Google Dataset Search, which provides search capabilities over potentially all dataset repositories on the Web. I will talk about the open ecosystem for describing and citing datasets that we hope to encourage, and about the technical details of how we built Dataset Search. Finally, I will highlight research challenges in building a vibrant, heterogeneous, and open ecosystem in which data becomes a first-class citizen.