PyData Montreal #17: NLP meetup

Hosted By
Alex K. and Maria K.

Details

"Fine-tuning a large language model without your own supercomputer"
Abstract:
State-of-the-art results in NLP increasingly rely on a generic pretrained language model fine-tuned on a specific downstream task. This, in turn, has led researchers to develop larger and larger language models, which give better results than their smaller versions but can be hard to use without powerful (and expensive) hardware. In this talk, we will dive into the techniques one can use to reduce the GPU memory used when fine-tuning a model (gradient accumulation, gradient checkpointing, model parallelism, and the more recent ZeRO/ZeRO-Offload developments), and show how to apply them to a fine-tuning example. In particular, we will see how the variants of ZeRO are currently implemented in the Transformers library, through the use of FairScale and DeepSpeed.
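
To give a flavour of the first technique named in the abstract, here is a minimal sketch of gradient accumulation. It is not taken from the talk: it uses a toy one-parameter linear model in plain Python rather than a real language model or the Transformers library, and every name in it (grad_mse, accumulated_grad, micro_batch_size) is hypothetical. The point it illustrates is that summing suitably weighted gradients over small micro-batches gives the same gradient as one large batch, so only a micro-batch needs to fit in memory at a time.

```python
# Gradient accumulation on a toy model y = w * x, in plain Python.
# All names are illustrative assumptions, not code from the talk.

def grad_mse(w, xs, ys):
    """Gradient w.r.t. w of the mean squared error over the given samples."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Same gradient as grad_mse, computed one micro-batch at a time."""
    n = len(xs)
    total = 0.0
    for start in range(0, n, micro_batch_size):
        mx = xs[start:start + micro_batch_size]
        my = ys[start:start + micro_batch_size]
        # Weight each micro-batch by its share of the full batch so the
        # running sum equals the full-batch mean gradient.
        total += grad_mse(w, mx, my) * len(mx) / n
    return total


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
w = 0.5

full = grad_mse(w, xs, ys)                               # one batch of 8
accum = accumulated_grad(w, xs, ys, micro_batch_size=2)  # four batches of 2

print(abs(full - accum) < 1e-9)
```

In a real training loop the optimizer step would be taken only after the last micro-batch, using the accumulated gradient; the sketch stops at the gradient itself to keep the equivalence easy to check.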

About Sylvain:
Sylvain Gugger is a Research Engineer at Hugging Face and one of the core maintainers of the Transformers library. Previously, he was a Research Scientist at fast.ai and co-wrote Deep Learning for Coders with fastai and PyTorch with Jeremy Howard. The main focus of his research is to make Deep Learning more accessible, by designing and improving techniques that allow models to train fast on limited resources.
Prior to this, he taught computer science and mathematics in France for seven years. Sylvain is an alumnus of the École Normale Supérieure (Paris, France), where he studied mathematics, and has a master's degree in mathematics from the University of Paris XI (Orsay, France).

---
"Learning to translate with JoeyNMT"
Machine translation is one of the most impactful NLP applications. Despite many toolkits for it being open-sourced, we realized there was something missing: one that is developed for beginners and newcomers to the field. In this talk, we will get to know the JoeyNMT toolkit (https://github.com/joeynmt), a minimalist toolkit for machine translation built on PyTorch. We will learn about the development process, deep dive into the code, and talk about applications.

About Julia:
Julia Kreutzer is a Research Scientist at Google Montreal, where she works on improving machine translation. She holds a PhD in Computational Linguistics from Heidelberg University, Germany, and is also part of the grassroots community Masakhane which aims to build NLP technologies for all of Africa's languages. She deeply cares about open-source, accessibility, and creativity in NLP research.

---
"Understanding digital social trace data via Information Extraction"
Abstract:
I will describe how to efficiently extract information (e.g. phrases, named entities, and categories like topics, sentiment, sarcasm, or abuse) from social media text, using semi-supervised, multi-task, and active learning to identify phrases and named entities and to classify text more accurately. I will introduce an open source tool called SocialMediaIE. Next, I will show how information extraction from social media data can be viewed through the lens of an abstraction called digital social trace data (DSTD), which allows us to utilize the graph structure of the data for developing new information extraction tasks. I will end with ways to improve text classification via features derived from the DSTD representation. Details can be found at: https://shubhanshu.com/phd_thesis/.

About Shubhanshu:
Shubhanshu is a Machine Learning Researcher at Twitter working on the Content Understanding Research team. He finished his Ph.D. on Information Extraction from Digital Social Trace Data with Applications to Social Media and Scholarly Communication Data at the iSchool, University of Illinois at Urbana-Champaign. His current work is at the intersection of machine learning, information extraction, social network analysis, and visualizations. More information about Shubhanshu can be found at: https://shubhanshu.com/
