Meta AI | Language Models Can Teach Themselves to Use Tools

Published on Jul 20, 2023

Sponsored by Evolution AI: https://www.evolution.ai

Abstract: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. We introduce Toolformer, a model trained in a self-supervised way to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction, while requiring nothing more than a handful of demonstrations for each API. We incorporate a range of tools, including a calculator, a Q&A system, two different search engines, a translation system, and a calendar. Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, and does not sacrifice performance on its core language modeling task.
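To make the mechanism concrete: the Toolformer paper annotates text with inline API calls written as "[Tool(args) → result]", e.g. "[Calculator(400 / 1400) → 0.29]". The Python sketch below is a minimal, hypothetical illustration of executing such calls and splicing the results back into the text; it is not the authors' implementation, and the fill_api_results helper, the regex, and the toy calculator are assumptions made purely for illustration.

import re

# Illustrative tool registry; the paper's tools also include a Q&A system,
# search engines, a translation system, and a calendar.
def calculator(expression: str) -> str:
    # Accept simple arithmetic only; a real system would sandbox evaluation.
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        raise ValueError(f"unsupported expression: {expression}")
    return f"{eval(expression):.2f}"

TOOLS = {"Calculator": calculator}

# Matches an unanswered inline call such as "[Calculator(400 / 1400)]".
CALL_PATTERN = re.compile(r"\[(\w+)\((.*?)\)\]")

def fill_api_results(text: str) -> str:
    """Replace each "[Tool(args)]" marker with "[Tool(args) → result]",
    the annotation format described in the Toolformer paper."""
    def run(match: re.Match) -> str:
        tool, args = match.group(1), match.group(2)
        if tool not in TOOLS:
            return match.group(0)  # leave unknown tools untouched
        return f"[{tool}({args}) → {TOOLS[tool](args)}]"
    return CALL_PATTERN.sub(run, text)

if __name__ == "__main__":
    generated = "Out of 1400 participants, 400 [Calculator(400 / 1400)] passed."
    print(fill_api_results(generated))
    # Out of 1400 participants, 400 [Calculator(400 / 1400) → 0.29] passed.

In the paper itself, such annotations are produced self-supervised: candidate calls are sampled, executed, and kept only when they reduce the language-modeling loss, and the model is then fine-tuned to emit them on its own.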

Speaker bio: Jane Dwivedi-Yu is a researcher at Meta AI. Her current research focuses on enhancing the capabilities of language models along several dimensions, including tool usage, editing, and evaluating representation harms and the notions of morality and norms internalized by these models. She is also interested in building large-scale personalized recommender systems by leveraging principles from affective computing, work that was cited among the top 15 AI papers to read in 2022. Before joining Meta, she completed her PhD in Computer Science at the University of California, Berkeley, and her Bachelor's at Cornell University.
