in-person | Meta AI | Language Models Can Teach Themselves to Use Tools

Name: *in-person* | Meta AI | Language Models Can Teach Themselves to Use Tools
Start: 2023-06-07T18:00:00+01:00
End: 2023-06-07T20:00:00+01:00
Location: Whittington House

Hosted By

Martin G.

Details

Please note that RSVPs will close at noon on Monday 5th June to allow our venue enough time to make security arrangements. To avoid disappointment, please register as soon as possible.

Title: Language Models Can Teach Themselves to Use Tools
Paper: https://arxiv.org/abs/2302.04761
Abstract: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds: We introduce Toolformer, a model trained in a self-supervised way to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction, while requiring nothing more than a handful of demonstrations for each API. We incorporate a range of tools, including a calculator, a Q&A system, two different search engines, a translation system, and a calendar. Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, and does not sacrifice performance on its core language modeling task.

Speaker bios: Jane Dwivedi-Yu is a researcher at Meta AI. Her current research focuses on enhancing the capabilities of language models along several dimensions, including tool usage, editing, and evaluating representation harms and notions of morality and norms internalized by these models. She is also interested in building large-scale personalized recommender systems by leveraging principles from affective computing, work which was cited among the top 15 AI papers to read in 2022. Before joining Meta, she completed her PhD in Computer Science at the University of California, Berkeley, and her bachelor's degree at Cornell University.

Nicola Cancedda is a research scientist manager with Meta's Fundamental AI Research (FAIR) organisation. His current research focus is on expanding the capabilities of foundation models to safely interact with the world. Nicola is an alumnus of the University of Rome "La Sapienza", and has held applied and fundamental research and management positions at Meta, Xerox, and Microsoft, pushing the state of the art on tasks in Machine Learning, Machine Translation, and Natural Language Processing, and leading the transfer of research results to large-scale production environments.

Agenda: