Learning to Decode Collaboratively with Multiple Language Models

Shannon Shen, Hunter Lang, Bailin Wang, Yoon Kim, David Sontag

2024

Abstract

We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent variable. By optimizing the marginal likelihood of a training set under our latent variable model, the base LLM automatically learns when to generate itself and when to call on one of the “assistant” language models to generate, all without direct supervision. Token-level collaboration during decoding allows for a fusion of each model’s expertise in a manner tailored to the specific task at hand. Our collaborative decoding is especially useful in cross-domain settings where a generalist base LLM learns to invoke domain expert models. On instruction-following, domain-specific QA, and reasoning tasks, we show that the performance of the joint system exceeds that of the individual models. Through qualitative analysis of the learned latent decisions, we show that models trained with our method exhibit several interesting collaboration patterns, e.g., template-filling.
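
As a rough illustration of the latent-variable idea in the abstract, the following minimal Python sketch computes a per-token marginal likelihood in which the choice of generating model is summed out. It is a toy written for this page, not the authors' implementation: the function names, the `gate_probs` input (a hypothetical distribution over which model emits the next token), and the example numbers are all assumptions.

```python
import math


def marginal_token_log_likelihood(gate_probs, model_token_probs):
    """Return log sum_z P(z | prefix) * P_z(token | prefix) for one token.

    gate_probs: assumed probability that each model (base or assistant)
        is chosen to generate the next token; expected to sum to 1.
    model_token_probs: probability each model assigns to the observed token.
    """
    if len(gate_probs) != len(model_token_probs):
        raise ValueError("need one gate probability per model")
    # Sum out the latent choice of model instead of supervising it.
    mixture = sum(g * p for g, p in zip(gate_probs, model_token_probs))
    return math.log(mixture)


def sequence_log_likelihood(steps):
    """Sum the per-token marginal log-likelihoods over a sequence.

    steps: iterable of (gate_probs, model_token_probs) pairs, one per token.
    """
    return sum(marginal_token_log_likelihood(g, p) for g, p in steps)


# Toy example with a base model (index 0) and one assistant (index 1).
example_steps = [
    ([0.5, 0.5], [0.2, 0.6]),  # mixture probability 0.4
    ([0.9, 0.1], [0.5, 0.5]),  # mixture probability 0.5
]
example_ll = sequence_log_likelihood(example_steps)
```

Maximizing a quantity like `example_ll` over a training set would push the gate toward whichever model explains each token better, which is one way to read "without direct supervision"; how the gate is parameterized and how it is used at decoding time are not specified on this page.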

Type

Conference paper

Publication

ACL 2024

Shannon Shen

PhD Student

My research lies at the intersection of NLP and HCI. I am interested in understanding language in scientific, legal, and clinical documents that are authored and used by domain experts. I study how newly developed NLP approaches can enable better human-AI collaboration to assist experts in these high-stakes settings.

Hunter Lang

PhD Student

Hunter’s research focuses on understanding and improving the performance of machine learning algorithms in the wild, with particular applications in MAP inference for graphical models, stochastic optimization, and weak supervision.

Yoon Kim

Assistant Professor, MIT

David Sontag

Professor of EECS
