What we’re about
Information drives the planet. We organize talks around implementations of information retrieval in search engines, recommender systems, and conversational assistants. Our meetups are usually held on the last Friday of the month at Science Park Amsterdam. We typically have two talks in a row, one industrial and one academic, 25+5 minutes each: no marketing, just algorithms, followed by drinks. We also host ad hoc "single shot" events whenever an interesting visitor stops by to share their work.
Search Engines Amsterdam is supported by the ELLIS unit Amsterdam.
Follow @irlab_amsterdam on Twitter for the latest updates.
Upcoming events
SEA: On Offline Evaluation of Recommender Systems
We will host Olivier Jeunen from ShareChat this May in a SEA Single Shot edition (45 min talk + 15 min Q&A). Olivier will discuss his recent work critically examining nDCG as a prevalent offline evaluation metric for top-n recommender systems. Join us online or in person at Lab42 in room L3.36. Note that this SEA meetup starts an hour earlier than usual, at 16:00 CET.
***
IMPORTANT: You can view the Zoom link once you 'attend' the meetup on this page.
***

Speaker: Olivier Jeunen (ShareChat)
Title: On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-n Recommendation
Time: 16:00
Abstract: Approaches to recommendation are typically evaluated in one of two ways: (1) via a (simulated) online experiment, often seen as the gold standard, or (2) via some offline evaluation procedure, where the goal is to approximate the outcome of an online experiment. Several offline evaluation metrics have been adopted in the literature, inspired by ranking metrics prevalent in the field of Information Retrieval. (Normalised) Discounted Cumulative Gain (nDCG) is one such metric that has seen widespread adoption in empirical studies, and higher (n)DCG values have been used to present new methods as the state-of-the-art in top-n recommendation for many years. Our work takes a critical look at this approach, and investigates when we can expect such metrics to approximate the gold standard outcome of an online experiment. We formally present the assumptions that are necessary to consider DCG an unbiased estimator of online reward and provide a derivation for this metric from first principles, highlighting where we deviate from its traditional uses in IR. Importantly, we show that normalising the metric renders it inconsistent, in that even when DCG is unbiased, ranking competing methods by their normalised DCG can invert their relative order. Through a correlation analysis between off- and on-line experiments conducted on a large-scale recommendation platform, we show that our unbiased DCG estimates strongly correlate with online reward, even when some of the metric's inherent assumptions are violated. This statement no longer holds for its normalised variant, suggesting that nDCG's practical utility may be limited.

SEA Talk #268
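The inconsistency mentioned in the abstract, that ranking methods by normalised DCG can invert the order given by unnormalised DCG, can be illustrated with a toy example. The sketch below uses entirely hypothetical numbers (not from the paper): because nDCG averages per-query ratios DCG/IDCG, queries with small ideal DCGs are weighted disproportionately, and the average-of-ratios can disagree with the ratio-of-averages.

```python
import math

def dcg(relevances):
    """Discounted Cumulative Gain with the standard log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

# Example: a ranking with graded relevance labels [3, 2, 0].
print(dcg([3, 2, 0]))  # 3/log2(2) + 2/log2(3) + 0 ≈ 4.26

# Hypothetical setup: two queries whose ideal DCGs differ greatly,
# and two competing rankers A and B with given per-query DCG values.
idcg = [1.0, 10.0]   # ideal DCG per query (assumed)
dcg_a = [0.9, 5.0]   # ranker A's per-query DCG (assumed)
dcg_b = [0.5, 7.0]   # ranker B's per-query DCG (assumed)

def mean(xs):
    return sum(xs) / len(xs)

avg_dcg_a = mean(dcg_a)                                   # 2.95
avg_dcg_b = mean(dcg_b)                                   # 3.75 -> B wins on DCG
avg_ndcg_a = mean([d / i for d, i in zip(dcg_a, idcg)])   # (0.9 + 0.5) / 2 = 0.70
avg_ndcg_b = mean([d / i for d, i in zip(dcg_b, idcg)])   # (0.5 + 0.7) / 2 = 0.60

# Normalisation inverts the ordering: B is better by average DCG,
# yet A is better by average nDCG.
assert avg_dcg_b > avg_dcg_a and avg_ndcg_a > avg_ndcg_b
```

This is only a numerical illustration of the phenomenon the talk analyses formally; the paper's argument rests on a derivation of DCG as an estimator of online reward, not on any particular set of numbers.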