Out-of-Distribution Detection at Scale (Stanislav Fořt)


Details
Speaker:
Stanislav Fořt (Research Scientist & Member of Technical Staff at Anthropic; Ph.D. from Stanford University)
Abstract:
Near out-of-distribution (OOD) detection poses a major challenge for deep neural networks. Large models trained on massive datasets have been performing exceptionally well on a host of performance measures, many of which they have not been explicitly trained for. We demonstrate that large-scale pre-trained transformers can significantly improve the state-of-the-art (SOTA) on a range of near-OOD tasks across different data modalities. For instance, on the challenging CIFAR-100 vs CIFAR-10 OOD detection task, we improve the AUROC from 85% (previous SOTA) to 98% using Vision Transformers pre-trained on ImageNet-21k (our estimate for human performance lies around 96%). On a challenging genomics OOD detection benchmark, we improve the AUROC from 66% to 77% using transformers and unsupervised pre-training. To further improve performance, we explore the few-shot outlier exposure setting, where a few examples from outlier classes may be available; we show that pre-trained transformers are particularly well-suited for outlier exposure. For multi-modal image-text pre-trained transformers such as CLIP, we explore a new way of using just the names of outlier classes as the sole source of information, without any accompanying images, and show that this outperforms previous SOTA on standard vision OOD benchmark tasks. Our results fit well within the empirical trend of significant benefits of scale and data for many machine learning tasks. Despite the success of these methods, we demonstrate that they are unfortunately still extremely brittle to targeted adversarial attacks, showing that the story of near-OOD detection is not yet complete.
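The AUROC figures quoted in the abstract measure how well a scalar OOD score separates in-distribution inputs from out-of-distribution ones. The following is a minimal sketch of that metric, not the authors' evaluation code. The function names `ood_auroc` and `max_softmax_ood_score` are hypothetical, and the max-softmax-probability score is used here only as an assumed, commonly used baseline score; the abstract does not say which score the talk uses.

```python
import numpy as np


def ood_auroc(scores_in, scores_out):
    """AUROC for OOD detection, where a higher score means "more OOD".

    Equals the probability that a randomly chosen OOD example scores
    higher than a randomly chosen in-distribution example, counting
    ties as one half. Uses all pairs, so it suits small illustrations.
    """
    s_in = np.asarray(scores_in, dtype=float)
    s_out = np.asarray(scores_out, dtype=float)
    diff = s_out[:, None] - s_in[None, :]  # every (OOD, in-distribution) pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / diff.size)


def max_softmax_ood_score(logits):
    """Assumed baseline score: 1 minus the maximum softmax probability."""
    logits = np.asarray(logits, dtype=float)
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return 1.0 - probs.max(axis=1)


if __name__ == "__main__":
    # Toy logits (made up for illustration): confident predictions for
    # in-distribution inputs, flatter predictions for OOD inputs.
    logits_in = [[8.0, 0.0, 0.0], [0.0, 6.0, 1.0]]
    logits_out = [[1.0, 1.2, 0.9], [0.5, 0.4, 0.6]]
    auroc = ood_auroc(max_softmax_ood_score(logits_in),
                      max_softmax_ood_score(logits_out))
    print(f"AUROC: {auroc:.2f}")
```

An AUROC of 0.5 corresponds to a score that cannot tell the two sets apart, and 1.0 to perfect separation, which is the scale on which the 85% and 98% figures above should be read.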
Key paper:
"Exploring the Limits of Out-of-Distribution Detection", Stanislav Fort, Jie Ren, Balaji Lakshminarayanan. Advances in Neural Information Processing Systems 34 (2021): 7068-7081.
https://arxiv.org/abs/2106.03004
