Multilingual NLP: Zero-shot Cross-lingual Transfer


Details
Junjie Hu from the University of Wisconsin-Madison will join us to present on work at the intersection of human and machine intelligence that builds on his recently completed PhD at Carnegie Mellon.
Summary:
Over the last decade, the phenomenal success of NLP systems has been driven mostly by deep neural networks and supervised machine learning on large amounts of labeled data. However, it is infeasible to annotate data for every possible real-world scenario. As a result, these systems may fail dramatically in practice when dealing with complex textual data written in different languages, or even associated with different data modalities.
In this talk, I will present work on two distinct aspects that are important for extending the generalization ability of NLP systems. First, I will present my work on XTREME, which provides a platform for cross-lingual learning on 9 NLP tasks over 40 languages. I will then introduce a training technique for learning multilingual representations for words and sentences. Finally, I will present our work on pre-training vision-language models for multilingual text-to-video search in the zero-shot cross-lingual setting. I will conclude with an overview of my research and my research plans in the interdisciplinary field of AI and data science.
Bio:
Junjie Hu is an Assistant Professor at the University of Wisconsin-Madison. He obtained his Ph.D. in Computer Science at Carnegie Mellon University, under the supervision of Prof. Jaime Carbonell and Prof. Graham Neubig. His research lies at the intersection of natural language processing and machine learning.
Jana Thompson (https://www.luxzia.ai/) will be the guest host for this month's event!
Although they are not required reading in advance, here are three papers that Junjie will be drawing on in this talk:
