Lucas Beyer | Learning General Visual Representations
Details
Virtual London Machine Learning Meetup - 09.03.2022 @ 18:30
We would like to invite you to our next Virtual Machine Learning Meetup.
Agenda:
- 18:25: Virtual doors open
- 18:30: Talk
- 19:10: Q&A session
- 19:30: Close
Sponsors
https://evolution.ai/ : Machines that Read - Intelligent data extraction from corporate and financial documents.
- Title: Learning General Visual Representations (Lucas Beyer, Researcher at Google Brain Zurich)
Abstract: In the quest for the best generic visual representation ("vision backbone"), I have landed at large-scale pre-training and transfer. This talk walks through some highlights of that journey: starting with a clear definition of the setting (Visual Task Adaptation Benchmark, VTAB, arxiv.org/abs/1910.04867), going in depth on our first breakthrough result in large-scale pre-training (Big Transfer, BiT, arxiv.org/abs/1912.11370) and a more recent one applying the transformer architecture to images (Vision Transformer, ViT, arxiv.org/abs/2010.11929), and asking whether any of this is still meaningful (Are we done with ImageNet?, arxiv.org/abs/2006.07159). I will briefly cover important progress in making large models practically usable via knowledge distillation (patient and consistent teachers, arxiv.org/abs/2106.05237), and conclude by introducing a recent alternative to the typical transfer-learning approach (Locked-image Text tuning, or LiT-tuning, arxiv.org/abs/2111.07991).
Bio: Lucas grew up in Belgium wanting to make video games and their AI, went on to study mechanical engineering at RWTH Aachen in Germany, did a PhD in robotic perception/computer vision there too, and is now researching representation learning and vision backbones at Google Brain in Zürich.