Layering Search Algorithms and Underperforming Torso Queries

As usual, we will have two talks followed by drinks. The industry talk will be given by Erik Groeneveld from Seecr (http://seecr.nl/). He will talk about layering search algorithms to satisfy very complicated search requirements. The academic talk will be given by Masrour Zoghi from UvA on click-based hot fixes for underperforming torso queries. The abstracts and speaker bios follow.

Erik Groeneveld (Seecr)--Layering Search Algorithms to Satisfy Complicated Information Needs

This talk presents how eight different search algorithms take consecutive turns to work gradually from a large corpus towards a few precisely ranked hits. The case comes from a real-world search engine implemented by the Dutch Royal Library. The talk presents the current state of affairs and demonstrates results. The presenter wants to share his approach and hopes to get feedback on it.
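To give a feel for what "taking consecutive turns" could look like, here is a minimal Python sketch of a layered cascade in which each layer re-scores the surviving candidates and keeps only the best few. It is an illustration under assumptions, not the system described in the talk: the names `run_cascade`, `term_overlap` and `title_match`, the two toy scorers, and the sample corpus are all hypothetical, and the real engine uses eight algorithms that the abstract does not name.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, object]
Scorer = Callable[[str, Doc], float]


def run_cascade(query: str, corpus: List[Doc],
                layers: List[Tuple[Scorer, int]]) -> List[Doc]:
    """Apply each (scorer, keep) layer in turn, shrinking the candidate set."""
    candidates = list(corpus)
    for scorer, keep in layers:
        # sorted() is stable, so ties keep the order from the previous layer.
        ranked = sorted(candidates, key=lambda d: scorer(query, d), reverse=True)
        candidates = ranked[:keep]
    return candidates


def term_overlap(query: str, doc: Doc) -> float:
    """Hypothetical cheap first layer: number of query terms found in the text."""
    terms = set(query.lower().split())
    return float(len(terms & set(str(doc["text"]).lower().split())))


def title_match(query: str, doc: Doc) -> float:
    """Hypothetical precise later layer: exact phrase match in the title."""
    return 1.0 if query.lower() in str(doc["title"]).lower() else 0.0


corpus = [
    {"id": 1, "title": "Hempcrete houses", "text": "building a house with hemp and lime"},
    {"id": 2, "title": "Search engines", "text": "layered search algorithms for a large corpus"},
    {"id": 3, "title": "Layered search", "text": "search layers narrow a corpus to a few hits"},
    {"id": 4, "title": "Cooking", "text": "recipes with lime"},
]

# Broad, cheap layer first (keep 3), then a stricter layer (keep 1).
top = run_cascade("layered search", corpus, [(term_overlap, 3), (title_match, 1)])
```

The design choice illustrated here is that cheap, recall-oriented layers run over the whole corpus while more precise layers only see the few candidates that survive.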

Erik Groeneveld's bio: I am a distributed systems and information retrieval expert. I create and implement efficient and scalable search engines using the latest algorithms. I do so with the help of the people at Seecr who share my passion.
Open source software means to me the ability to execute: being able to download it right now and fix any problem that appears allows me to work fast and meet my goals. I introduced XP in The Netherlands and initiated user groups and conferences about Agile. I am now a post-agilist: I believe adventures succeed not because of the method but because of the people. I have no other objectives in life than being happy now and old and wise later, knowing that only one of these three happens naturally. In 2015 I am building a house out of hemp and lime (hempcrete). @hennephuis (in Dutch)

Masrour Zoghi (University of Amsterdam)--Click-based Hot Fixes for Underperforming Torso Queries

Ranking documents using their historical click-through rate (CTR) can improve relevance for frequently occurring queries, i.e., so-called head queries. It is difficult to use such click signals on non-head queries as they receive fewer clicks. In this work, we address the challenge of dealing with torso queries on which the production ranker is performing poorly. Torso queries are queries that occur frequently enough so that they are not considered as tail queries and yet not frequently enough to be head queries either. They comprise a large portion of most commercial search engines' traffic, so the presence of a large number of underperforming torso queries can harm the overall performance significantly. We propose a practical method for dealing with such cases, drawing inspiration from the literature on learning to rank (LTR). Our method requires relatively few clicks from users to derive a strong re-ranking signal by comparing document relevance between pairs of documents instead of using absolute numbers of clicks per document. By infusing a modest amount of exploration into the ranked lists produced by a production ranker and extracting preferences between documents, we obtain substantial improvements over the production ranker in terms of page-level online metrics. We use an exploration dataset consisting of real user clicks from a large-scale commercial search engine to demonstrate the effectiveness of the method. We conduct further experimentation on public benchmark data using simulated clicks to gain insight into the inner workings of the proposed method. Our results indicate a need for LTR methods that make more explicit use of the query and other contextual information.
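The idea of comparing pairs of documents rather than counting clicks per document can be sketched in a few lines of Python. This is not the method from the talk: it swaps in the well-known "a clicked document beats unclicked documents shown above it" heuristic and a simple net-wins count, and the function names `extract_preferences` and `rerank` and the toy impressions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def extract_preferences(ranking: List[str], clicked: Set[str]) -> List[Tuple[str, str]]:
    """Return (winner, loser) pairs: a clicked doc beats each unclicked doc above it."""
    prefs = []
    for pos, doc in enumerate(ranking):
        if doc in clicked:
            for above in ranking[:pos]:
                if above not in clicked:
                    prefs.append((doc, above))
    return prefs


def rerank(production: List[str],
           impressions: List[Tuple[List[str], Set[str]]]) -> List[str]:
    """Re-rank the production list by net pairwise wins (assumed aggregation)."""
    net: Dict[str, int] = defaultdict(int)
    for ranking, clicked in impressions:
        for winner, loser in extract_preferences(ranking, clicked):
            net[winner] += 1
            net[loser] -= 1
    # Stable sort: documents with equal net wins keep their production order.
    return sorted(production, key=lambda d: -net[d])


production = ["a", "b", "c"]
# Each impression is (list shown to the user, set of clicked documents).
# The shuffled lists stand in for a modest amount of exploration.
impressions = [
    (["a", "b", "c"], {"c"}),
    (["b", "a", "c"], {"c"}),
    (["a", "c", "b"], {"b", "c"}),
]
reranked = rerank(production, impressions)
```

With these three impressions, document `c` collects five pairwise wins and moves to the top, even though it received only three clicks in total. Documents that never appear in any preference pair keep a score of zero in this sketch, which is one reason a real system would need a more careful aggregation.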

Masrour Zoghi's bio: Masrour Zoghi is a graduate student in the ILPS group at the University of Amsterdam, working on bandits, ranking and preference learning.