Annotative Indexing


Details
Another edition of SEA XL: Charles Clarke will join us on location and present work on annotative indexing.
This is a hybrid event. The speaker and on-site audience can be found at Lab42, Room L3.33. The Zoom link, in case you want to join online, is visible once you "attend" the meetup on this page.
Speaker: Charles Clarke, University of Waterloo
Title: Annotative Indexing.
Abstract: This talk presents and explores annotative indexing, a novel framework that unifies and generalizes traditional inverted indices, column stores, object stores, and graph databases. As a result, annotative indexing can provide the underlying indexing framework for retrieval systems that integrate sparse retrieval, dense retrieval, entity retrieval, knowledge graphs, and semi-structured data. While our reference implementation primarily supports human language data in the form of text, annotative indexing is sufficiently general to support a wide range of other data types. The talk will include examples of SQL-like queries over a JSON store built on our reference implementation that include numbers and dates. Taking advantage of the flexibility of annotative indexing, the talk will also demonstrate a fully dynamic inverted index incorporating support for ACID properties of transactions with hundreds of concurrent readers and writers.
Bio: Charles Clarke is a Professor in the School of Computer Science and an Associate Dean for Innovation and Entrepreneurship at the University of Waterloo, Canada. His research focuses on data intensive tasks involving human language data, including search, ranking, and question answering. Clarke is an ACM Distinguished Scientist and a leading member of the search and information retrieval community. From 2013 to 2016 he served as the Chair of the Executive Committee for the ACM Special Interest Group on Information Retrieval (SIGIR). From 2010 to 2018 he was Co-Editor-in-Chief of the Information Retrieval Journal. He was Program Co-Chair for the SIGIR main conference in 2007 and 2014, and he was elected to the SIGIR Academy in 2022. His research has been funded by Google, Microsoft, Meta, Spotify, and other companies and granting agencies. Along with Mark Smucker, he received the SIGIR 2012 Best Paper Award. Along with colleagues, he received the SIGIR 2019 Test of Time Award for their SIGIR 2008 paper on novelty and diversity in search. In 2006 he spent a sabbatical at Microsoft, where he was involved in the development of what is now the Bing search engine. From August 2016 to August 2018, while on leave from Waterloo, he was a Software Engineer at what is now Meta, where he worked on metrics and ranking for Facebook Search. He is a co-author of the textbook Information Retrieval: Implementing and Evaluating Search Engines, MIT Press, 2010, which he has had the pleasure of seeing entirely deprecated in recent years. Almost.
Counter: Oh yes, this is SEA Talk #274.
