Let's Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning

Yinpeng Liu, Jiawei Liu, Xiang Shi, Qikai Cheng, Yong Huang, Wei Lu
2024-06-16
Abstract: Demonstration ordering, an important strategy for in-context learning (ICL), can significantly affect the performance of large language models (LLMs). However, most current ordering approaches incur high computational costs to introduce prior knowledge. In this paper, inspired by the human learning process, we propose a simple but effective demonstration ordering method for ICL, named few-shot In-Context Curriculum Learning (ICCL). ICCL gradually increases the complexity of prompt demonstrations during the inference process. Difficulty can be assessed by human experts or by LLM-driven metrics, such as perplexity. We then design extensive experiments to examine the effectiveness of ICCL at both the corpus level and the instance level. Moreover, we investigate the formation mechanism of LLMs' ICCL capability. Experimental results demonstrate that ICCL, developed during the instruction-tuning stage, is effective for representative open-source LLMs. To facilitate further research and applications by other scholars, we make the code publicly available.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the problem of how to order demonstrations in In-Context Learning (ICL) so as to improve the performance of Large Language Models (LLMs). Inspired by the human learning process, it proposes a method called "In-Context Curriculum Learning" (ICCL), which gradually increases the complexity of the prompt demonstrations to help the model better understand the task. The main contributions of the paper are:
1. Proposing ICCL, a simple yet effective demonstration ordering method, and validating its effectiveness for open-source LLMs.
2. Using perplexity as a metric of demonstration difficulty, which performs better than many existing ordering methods.
3. Investigating how the ICCL capability of LLMs forms, and finding that it is mainly acquired during the instruction fine-tuning stage.
Through experiments, the authors demonstrate that ICCL achieves significant and consistent performance improvements over random baselines and other methods across multiple NLP tasks, especially on large-scale open-source language models.
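The ordering idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' released code: the function names, the prompt template, the example sentences, and the per-token log-probabilities are all hypothetical, and the sketch assumes that perplexity is computed over each demonstration's text as the exponential of the mean negative log-likelihood per token. In practice the log-probabilities would come from the LLM itself.

```python
import math
from typing import Callable, List, Sequence, Tuple

Demo = Tuple[str, str]  # (input text, label); a hypothetical demonstration format


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def order_easy_to_hard(demos: Sequence[Demo],
                       difficulty: Callable[[Demo], float]) -> List[Demo]:
    """Sort demonstrations so that difficulty increases through the prompt."""
    return sorted(demos, key=difficulty)


def build_prompt(demos: Sequence[Demo], query: str) -> str:
    """Concatenate ordered demonstrations, then the unanswered query."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


# Made-up per-token log-probabilities standing in for real LLM scores.
toy_logprobs = {
    "Quantum decoherence limits qubit fidelity.": [-2.1, -1.8, -2.5, -1.9],
    "The cat sat.": [-0.2, -0.3, -0.1],
    "Prices rose sharply last year.": [-0.9, -1.1, -0.8, -1.0],
}
demos: List[Demo] = [
    ("Quantum decoherence limits qubit fidelity.", "physics"),
    ("The cat sat.", "animals"),
    ("Prices rose sharply last year.", "economy"),
]

# Lower perplexity is treated as "easier", so it is placed earlier.
ordered = order_easy_to_hard(demos, lambda d: perplexity(toy_logprobs[d[0]]))
prompt = build_prompt(ordered, "Stocks fell today.")
print(prompt)
```

Passing the difficulty score as a callable keeps the ordering step independent of how difficulty is measured, so a human-assigned difficulty rating could be substituted for perplexity without changing the rest of the sketch.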