Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen
2024-07-17
Abstract: We investigate how to elicit compositional generalization capabilities in large language models (LLMs). Compositional generalization empowers LLMs to solve complex problems by combining foundational skills, a critical reasoning ability akin to human intelligence. However, even the most advanced LLMs currently struggle with this form of reasoning. We examine this problem within the framework of in-context learning and find that demonstrating both foundational skills and compositional examples grounded in these skills within the same prompt context is crucial. We refer to this prompt structure as skills-in-context (SKiC). With as few as two exemplars, this in-context learning structure enables LLMs to tackle more challenging problems requiring innovative skill combinations, achieving near-perfect systematic generalization across a broad range of tasks. Intriguingly, SKiC also unlocks the latent potential of LLMs, allowing them to more actively utilize pre-existing internal skills acquired during earlier pretraining stages to solve complex reasoning problems. The SKiC structure is robust across different skill constructions and exemplar choices and demonstrates strong transferability to new tasks. Finally, inspired by our in-context learning study, we show that fine-tuning LLMs with SKiC-style data can elicit zero-shot weak-to-strong generalization, enabling the models to solve much harder problems directly with standard prompting.
Computation and Language
What problem does this paper attempt to address?
This paper addresses the limited ability of large language models (LLMs) to generalize compositionally. Although existing LLMs perform well on many natural language processing (NLP) tasks, they still struggle to solve new, more complex problems by combining basic skills they already have. The paper's core objective is to study how in-context learning can be used to improve compositional reasoning in LLMs, so that they can solve complex, unseen problems.

### Main contributions of the paper

1. **Proposing the Skills-in-Context (SKiC) structure**:
   - **Definition and structure**: SKiC is a prompt structure with three main parts:
     1. **Basic skills**: The basic skills required to solve the complex task.
     2. **Compositional examples**: Worked examples showing how these basic skills are combined to solve complex problems.
     3. **Problem to be solved**: The actual problem that needs to be solved.
   - **Function**: By presenting the basic skills together with examples of how they combine, SKiC helps LLMs explicitly ground each reasoning step in a basic skill given in the context, which leads to stronger compositional generalization.
2. **Experimental verification**:
   - **Systematic generalization**: SKiC achieves near-perfect systematic generalization on a range of tasks, such as last-letter concatenation, addition, multiplication, and dynamic programming.
   - **Complex reasoning**: On tasks that require internal skills acquired during pretraining (such as GSM8K and MATH), SKiC also shows significant advantages. Even when the skills provided in the prompt are incomplete, LLMs can draw on their internal skills to reason effectively.
3. **Beyond in-context learning**:
   - **Fine-tuning effect**: Inspired by the SKiC structure, fine-tuning LLMs on SKiC-annotated data further improves their generalization from simple to complex tasks, outperforming the traditional chain-of-thought (CoT) approach.
4. **Robustness and transferability**:
   - **Robustness**: SKiC is robust to different skill constructions and exemplar choices.
   - **Transferability**: SKiC transfers well across tasks. Even when a prompt originally designed for one task is applied to a new task, it still outperforms traditional methods.

### Summary

By introducing the Skills-in-Context (SKiC) prompt structure, this paper addresses the difficulty LLMs have with compositional generalization, enabling them to combine basic skills more effectively within the in-context learning framework and to solve complex problems. The method improves systematic generalization and also shows that LLMs can more actively use the internal skills acquired during pretraining, yielding better performance across a variety of tasks.
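
The three-part prompt structure described above (basic skills, compositional examples, problem to be solved) can be illustrated with a minimal sketch. The `Skill` and `Exemplar` classes, the `build_skic_prompt` function, the section labels, and the last-letter example are all hypothetical and chosen only for illustration; they are not the paper's actual prompt text or code.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str         # short label used to refer to the skill in solutions
    description: str  # how to apply the skill, ideally with a tiny example


@dataclass
class Exemplar:
    question: str
    solution: str     # worked steps that explicitly name the skills they use


def build_skic_prompt(skills, exemplars, question):
    """Place skills, compositional exemplars, and the new problem in one context."""
    # Part 1: the basic skills.
    lines = ["Skills:"]
    for i, skill in enumerate(skills, start=1):
        lines.append(f"Skill {i} <{skill.name}>: {skill.description}")
    lines.append("")
    # Part 2: examples that compose those skills step by step.
    lines.append("Examples:")
    for i, ex in enumerate(exemplars, start=1):
        lines.append(f"Example {i}")
        lines.append(f"Question: {ex.question}")
        lines.append(f"Answer: {ex.solution}")
    lines.append("")
    # Part 3: the problem to be solved, left open for the model to complete.
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


# Illustrative (assumed) skills and exemplar for a last-letter task.
skills = [
    Skill("words_to_list", "Split a phrase into a list of words, "
          "e.g. 'apple, banana' -> ['apple', 'banana']."),
    Skill("last_letter", "Take the last letter of a word, e.g. 'apple' -> 'e'."),
]
exemplars = [
    Exemplar(
        question="Concatenate the last letters of: apple, banana",
        solution=(
            "1. Using Skill <words_to_list>: ['apple', 'banana'].\n"
            "2. Using Skill <last_letter>: 'apple' -> 'e', 'banana' -> 'a'.\n"
            "3. Concatenating the letters gives 'ea'. The answer is ea."
        ),
    )
]
prompt = build_skic_prompt(
    skills, exemplars, "Concatenate the last letters of: cat, dog, bird, fish"
)
print(prompt)
```

The exemplar here solves a two-word problem while the final question has four words, mirroring the idea that the model is asked to generalize to a harder composition than any it was shown. The resulting string would be sent to an LLM as an ordinary prompt; no model call is included in this sketch.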