Abstract: Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative and discriminative tasks. While promising, many-shot ICL can be bottlenecked by the available amount of human-generated examples. To mitigate this limitation, we explore two new settings: Reinforced and Unsupervised ICL. Reinforced ICL uses model-generated chain-of-thought rationales in place of human examples. Unsupervised ICL removes rationales from the prompt altogether, and prompts the model only with domain-specific questions. We find that both Reinforced and Unsupervised ICL can be quite effective in the many-shot regime, particularly on complex reasoning tasks. Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases, can learn high-dimensional functions with numerical inputs, and performs comparably to fine-tuning. Our analysis also reveals the limitations of next-token prediction loss as an indicator of downstream ICL performance.

What problem does this paper attempt to address?

The paper addresses two related questions: how in-context learning (ICL) performs when the prompt holds hundreds or thousands of examples (the many-shot regime), which newly expanded context windows make possible, and how to reduce many-shot ICL's reliance on high-quality human-generated data. Specifically, the paper focuses on the following aspects:

1. **Scaling up in-context learning**: Traditional in-context learning uses only a few examples, but the larger context windows of large language models (LLMs) allow hundreds or thousands of examples to be placed in the prompt. The paper systematically evaluates how performance on different tasks changes when moving from few-shot to many-shot.
2. **Improving performance on complex tasks**: The paper reports significant gains from many-shot in-context learning across a wide variety of generative and discriminative tasks, including complex reasoning tasks.
3. **Reducing reliance on human-generated data**: Many-shot in-context learning can be bottlenecked by the amount of human-generated examples available. To mitigate this, the paper introduces two settings:
   - **Reinforced ICL**: Uses model-generated chain-of-thought rationales in place of human-written ones.
   - **Unsupervised ICL**: Removes rationales from the prompt altogether and provides only domain-specific questions.
4. **Analyzing the dynamics of many-shot in-context learning**: The paper examines how learning behavior changes from few-shot to many-shot, including whether many-shot prompts can override pretraining biases and learn high-dimensional functions with numerical inputs.
5. **Evaluating the effectiveness of many-shot in-context learning**: Through experiments on multiple tasks (such as machine translation, summarization, planning, and code verification), the paper demonstrates the effectiveness of many-shot in-context learning and finds that it performs comparably to fine-tuning.

In summary, this paper aims to make large language models more adaptable and general by scaling up in-context learning, improving performance on complex tasks, and reducing reliance on high-quality human-generated data.
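
To make the difference between the settings described above concrete, the following is a minimal Python sketch of how the prompts might be assembled. Everything in it is an illustrative assumption rather than the paper's implementation: the `Example` type, the `Question:`/`Rationale:`/`Answer:` template, the `build_prompt` and `reinforced_shots` helpers, and the `generate` callable (a stand-in for a model call that returns a rationale and a final answer). In particular, keeping only rationales whose final answer matches a known answer is an assumed selection rule; the summary above says only that the rationales are model-generated.

```python
from typing import Callable, Iterable, List, NamedTuple, Optional, Tuple


class Example(NamedTuple):
    """One in-context shot (hypothetical container, not from the paper)."""
    question: str
    rationale: Optional[str] = None  # chain-of-thought text, if any
    answer: Optional[str] = None


def build_prompt(shots: Iterable[Example], query: str,
                 unsupervised: bool = False) -> str:
    """Concatenate the shots, then append the query to be answered."""
    blocks: List[str] = []
    for shot in shots:
        if unsupervised:
            # Unsupervised ICL: only the domain-specific question is shown.
            blocks.append(f"Question: {shot.question}")
        else:
            # Standard or Reinforced ICL: question, rationale, and answer.
            blocks.append(
                f"Question: {shot.question}\n"
                f"Rationale: {shot.rationale}\n"
                f"Answer: {shot.answer}"
            )
    blocks.append(f"Question: {query}\nRationale:")
    return "\n\n".join(blocks)


def reinforced_shots(
    problems: Iterable[Tuple[str, str]],
    generate: Callable[[str], Tuple[str, str]],
    samples_per_problem: int = 4,
) -> List[Example]:
    """Build shots from model-generated rationales.

    Assumption: a rationale is kept only if its final answer matches the
    known answer for that problem; at most one rationale is kept per problem.
    """
    kept: List[Example] = []
    for question, gold in problems:
        for _ in range(samples_per_problem):
            rationale, predicted = generate(question)
            if predicted.strip() == gold.strip():
                kept.append(Example(question, rationale, gold))
                break
    return kept
```

In use, the many-shot regime would simply pass hundreds or thousands of `Example` items to `build_prompt`; the only limit in this sketch is the model's context window, which the code does not check.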

Many-Shot In-Context Learning
