[UCL WI Talk]: Stopping Methods for Technology-Assisted Review
Details
Abstract:
Technology-Assisted Review (TAR) supports the screening of document collections for relevance. TAR is often applied in situations where high recall is important, such as systematic reviews of medical evidence or responses to legal disclosure requests. Stopping methods tell reviewers when they can safely stop examining documents, thereby reducing the effort required to review a collection.
We propose a novel stopping method based on point processes. The approach uses rate functions to model the occurrence of relevant documents in a ranking, and four candidate functions are compared. Evaluation is carried out using standard datasets (CLEF e-Health, TREC Total Recall, TREC Legal). Results show that the proposed method can achieve a desired level of recall without requiring an excessive number of documents to be examined, and that it compares well against multiple alternative approaches. The robustness of the proposed method is also explored by evaluating it on multiple rankings of varying effectiveness. The approach is extended by integrating a text classifier, thereby further reducing the number of documents that need to be examined.
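To give a feel for the idea of a rate-function-based stopping rule, here is a minimal sketch in Python. It is an illustration only and not the speaker's method: it assumes a single exponential rate function, rate(x) = a * exp(-k * x), fitted by a simple grid search over the likelihood of an inhomogeneous Poisson process, and it stops when the estimated recall reaches a target. The abstract does not name the four candidate functions, so the choice of an exponential here is an assumption, and all function names (fit_exponential_rate, estimated_recall, should_stop) are hypothetical.

```python
import math


def fit_exponential_rate(ranks, n_examined):
    """Fit rate(x) = a * exp(-k * x) to the ranks of relevant documents
    observed in the examined prefix [0, n_examined] of a ranking.

    Illustrative assumption: relevant documents follow an inhomogeneous
    Poisson process. For a fixed k the maximum-likelihood a has a closed
    form, so k is chosen by grid search over the profile log-likelihood.
    """
    m = len(ranks)
    if m == 0:
        raise ValueError("need at least one relevant document to fit a rate")
    total_rank = sum(ranks)
    # Log-spaced candidate decay rates from 1e-6 to 1 (an arbitrary grid).
    k_grid = [10 ** (e / 20.0) for e in range(-120, 1)]
    best_k, best_ll = k_grid[0], -math.inf
    for k in k_grid:
        mass = -math.expm1(-k * n_examined)  # 1 - exp(-k * n_examined)
        # Profile log-likelihood in k, dropping terms that do not depend on k.
        ll = m * math.log(k / mass) - k * total_rank
        if ll > best_ll:
            best_k, best_ll = k, ll
    a = m * best_k / (-math.expm1(-best_k * n_examined))
    return a, best_k


def estimated_recall(ranks, n_examined, collection_size):
    """Estimate recall as (relevant found) / (expected relevant in collection)."""
    a, k = fit_exponential_rate(ranks, n_examined)
    # Integral of the fitted rate over the whole ranking [0, collection_size].
    expected_total = a * (-math.expm1(-k * collection_size)) / k
    return min(1.0, len(ranks) / expected_total)


def should_stop(ranks, n_examined, collection_size, target_recall=0.95):
    """Return True once the estimated recall reaches the target."""
    return estimated_recall(ranks, n_examined, collection_size) >= target_recall


if __name__ == "__main__":
    # Relevant documents concentrated at the top of the ranking.
    top_heavy = [1, 2, 3, 5, 8, 13, 21]
    print(should_stop(top_heavy, n_examined=200, collection_size=1000))
    # Relevant documents spread evenly through the examined prefix.
    evenly_spread = list(range(10, 101, 10))
    print(should_stop(evenly_spread, n_examined=100, collection_size=1000))
```

In this sketch, a ranking whose relevant documents cluster near the top yields a fast-decaying fitted rate, so few further relevant documents are expected and the rule says to stop. When relevant documents keep turning up at a steady pace, the fitted rate barely decays, the estimated recall stays low, and the rule says to keep reviewing.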
Bio:
Mark Stevenson is a member of Sheffield University’s Natural Language Processing group and a Senior Lecturer in Computer Science. His research focuses on the development of systems to extract knowledge from text and assist users in accessing this information. He publishes on a range of topics within Natural Language Processing and Information Retrieval, including Word Sense Disambiguation, Information Extraction, identification of text reuse, Literature Based Discovery, Technology-Assisted Review and scholarly document processing. He has published over 150 papers in peer-reviewed journals and conferences. He is a member of the EPSRC Peer Review College (2006-present), has served as area chair for multiple conferences (EACL 2023, EACL 2021, ACL 2019, CoNLL 2019 and EACL 2017), is a standing reviewer for TACL and has been an editorial board member of Computational Linguistics. He has been PI for projects funded by the EPSRC, EU, NIHR, DSTL and Google.
