Many-Shot In-Context Learning

Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle
2024-05-23
Abstract: Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative and discriminative tasks. While promising, many-shot ICL can be bottlenecked by the available amount of human-generated examples. To mitigate this limitation, we explore two new settings: Reinforced and Unsupervised ICL. Reinforced ICL uses model-generated chain-of-thought rationales in place of human examples. Unsupervised ICL removes rationales from the prompt altogether, and prompts the model only with domain-specific questions. We find that both Reinforced and Unsupervised ICL can be quite effective in the many-shot regime, particularly on complex reasoning tasks. Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases, can learn high-dimensional functions with numerical inputs, and performs comparably to fine-tuning. Our analysis also reveals the limitations of next-token prediction loss as an indicator of downstream ICL performance.
Machine Learning, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper asks how in-context learning (ICL) performs when the prompt contains hundreds or thousands of examples (the many-shot regime), which newly expanded context windows make possible, and how to reduce many-shot ICL's reliance on high-quality human-generated data. Specifically, the paper focuses on the following aspects:

1. **Scaling up in-context learning**: ICL has mainly been studied with a few examples, but larger context windows in large language models (LLMs) now allow hundreds or thousands of examples in the prompt. The paper systematically evaluates how performance changes across tasks in the many-shot regime.
2. **Improving performance on complex tasks**: The paper reports significant gains from many-shot ICL across a variety of generative and discriminative tasks, especially complex reasoning tasks.
3. **Reducing reliance on human-generated data**: Many-shot ICL can be bottlenecked by the amount of available human-generated examples. To mitigate this, the paper explores two settings:
   - **Reinforced ICL**: uses model-generated chain-of-thought rationales in place of human-written ones.
   - **Unsupervised ICL**: removes rationales from the prompt altogether and prompts the model only with domain-specific questions.
4. **Analyzing the dynamics of many-shot in-context learning**: The paper examines how ICL behavior changes from few-shot to many-shot, including whether many-shot ICL can override pretraining biases and learn high-dimensional functions with numerical inputs.
5. **Evaluating the effectiveness of many-shot in-context learning**: Through experiments on multiple tasks (such as machine translation, summarization, planning, and code verification), the paper demonstrates the effectiveness of many-shot ICL and compares it with fine-tuning.
In summary, this paper aims to enhance the adaptability and generality of large language models by scaling up in-context learning, improving performance on complex tasks, and exploring methods to reduce reliance on high-quality human-generated data.
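
The three prompting settings described above differ mainly in what goes into the context. The Python sketch below illustrates one way they could be assembled; it is not the authors' implementation. The `Q:`/`A:` prompt layout, all function names, and the `sample_rationale` and `extract_answer` callables are hypothetical. Keeping only rationales whose final answer matches a known reference answer is likewise an assumption about how model-generated rationales might be selected, since this summary does not specify it.

```python
from typing import Callable, List, Tuple


def many_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Many-shot ICL: K (question, solution) pairs followed by the test query."""
    shots = [f"Q: {q}\nA: {a}" for q, a in examples]
    return "\n\n".join(shots + [f"Q: {query}\nA:"])


def unsupervised_prompt(questions: List[str], query: str) -> str:
    """Unsupervised ICL: only domain-specific questions, no rationales or answers."""
    shots = [f"Q: {q}" for q in questions]
    return "\n\n".join(shots + [f"Q: {query}\nA:"])


def reinforced_examples(
    problems: List[Tuple[str, str]],
    sample_rationale: Callable[[str], str],  # hypothetical: one call to the model
    extract_answer: Callable[[str], str],    # hypothetical: parses the final answer
    samples_per_problem: int = 4,
) -> List[Tuple[str, str]]:
    """Reinforced ICL (sketch): collect model-generated rationales to use as shots.

    Assumption: a rationale is kept only if its final answer equals the
    reference answer; at most one rationale is kept per problem.
    """
    kept = []
    for question, reference in problems:
        for _ in range(samples_per_problem):
            rationale = sample_rationale(question)
            if extract_answer(rationale) == reference:
                kept.append((question, rationale))
                break
    return kept
```

The output of `reinforced_examples` can be passed directly to `many_shot_prompt`, so the Reinforced setting reuses the standard prompt layout and changes only where the solutions come from.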