Abstract: It has been shown that Large Language Models' (LLMs) performance can be improved for many tasks using Chain of Thought (CoT) or In-Context Learning (ICL), which involve demonstrating the steps needed to solve a task using a few examples. However, while datasets with input-output pairs are relatively easy to produce, providing demonstrations which include intermediate steps requires cumbersome manual work. These steps may be executable programs, as in agentic flows, or step-by-step reasoning as in CoT. In this work, we propose Automatic Data Labeling and Refinement (ADLR), a method to automatically generate and filter demonstrations which include the above intermediate steps, starting from a small seed of manually crafted examples. We demonstrate the advantage of ADLR in code-based table QA and mathematical reasoning, achieving up to a 5.5% gain. The code implementing our method is provided in the Supplementary material and will be made available.

What problem does this paper attempt to address?

This paper addresses the cost of building demonstrations for in-context learning (ICL) in large language models (LLMs). Showing the steps needed to solve a task, whether as chain-of-thought (CoT) reasoning over a few examples or as an executable program, can improve LLM performance on a variety of tasks. However, producing a high-quality set of examples that include these intermediate steps requires a great deal of manual work, which is both time-consuming and labor-intensive. To overcome this, the authors propose Automatic Data Labeling and Refinement (ADLR), a method that automatically generates and filters demonstrations containing intermediate steps, starting from a small number of hand-crafted examples. The paper demonstrates ADLR on code-based table question answering (table QA) and mathematical reasoning, reporting a performance improvement of up to 5.5%.

ADLR has three main steps:

1. **Generate a large number of examples**: Start with a dataset containing inputs and final answers, and use the initial hand-crafted context to generate intermediate steps for these samples. Check each generated solution by verifying that it leads to the correct final answer. This step yields a set of solved examples and a set of unsolved (difficult) samples.
2. **Filter and refine examples**: Refine the set of solved examples according to two criteria. First, estimate the difficulty of each sample, measured as the proportion of runs that solve it correctly when the LLM is run multiple times on the same input with a non-zero temperature. Then test the utility of each example and select those that can solve many difficult samples in a single prompt.
3. **Use selected examples for ICL**: Use the refined set of examples to enhance the reasoning protocol of the underlying algorithm. Run the LLM with multiple diverse contexts, each containing a large random subset of the refined examples, and aggregate the results of the runs through majority voting.

In this way, ADLR improves the performance of existing algorithms in the ICL setting and provides a simple, effective way to generate and filter high-quality demonstrations for complex tasks.
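The three steps summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: every function name, the `(question, steps, answer)` tuple layout, the thresholds, and the `toy_solve` stand-in for an LLM call are hypothetical. In particular, scoring utility with the candidate as the only demonstration in the prompt, and probing difficulty with a deliberately weak context, are choices made for this toy example only.

```python
import random
from collections import Counter

def label_samples(samples, seed_demos, solve, rng):
    """Step 1: keep generated steps only when they reach the known final answer."""
    solved, unsolved = [], []
    for question, gold in samples:
        steps, pred = solve(seed_demos, question, rng)
        if pred == gold:
            solved.append((question, steps, gold))
        else:
            unsolved.append((question, gold))
    return solved, unsolved

def success_rate(question, gold, demos, solve, rng, n_runs=5):
    """Step 2a: share of repeated runs that reach the gold answer (low = hard)."""
    hits = sum(solve(demos, question, rng)[1] == gold for _ in range(n_runs))
    return hits / n_runs

def utility(demo, hard_samples, solve, rng):
    """Step 2b (assumed form): hard samples, other than the demo itself,
    solved when `demo` is the only demonstration in the prompt."""
    return sum(solve([demo], q, rng)[1] == gold
               for q, gold in hard_samples if q != demo[0])

def vote(question, pool, solve, rng, n_contexts=5, k=2):
    """Step 3: majority vote over runs, each using a random subset of the pool."""
    answers = []
    for _ in range(n_contexts):
        context = rng.sample(pool, min(k, len(pool)))
        answers.append(solve(context, question, rng)[1])
    return Counter(answers).most_common(1)[0][0]

def toy_solve(demos, question, rng):
    """Hypothetical stand-in for an LLM: it succeeds only when some
    demonstration in the context uses the same operator as the question."""
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    a, op, b = question.split()
    known_ops = {d[0].split()[1] for d in demos}
    if op in known_ops and op in ops:
        return f"compute {a} {op} {b}", str(ops[op](int(a), int(b)))
    return "guess", "?"

rng = random.Random(0)
seed = [("1 + 1", "compute 1 + 1", "2"), ("2 * 2", "compute 2 * 2", "4")]
samples = [("2 + 3", "5"), ("4 * 5", "20"), ("3 * 3", "9"), ("7 - 2", "5")]

solved, unsolved = label_samples(samples, seed, toy_solve, rng)

# Probe difficulty with a weak context (first seed demo only); assumption.
hard = [(q, gold) for q, _, gold in solved
        if success_rate(q, gold, seed[:1], toy_solve, rng) < 0.5]

# Keep only demonstrations that help on at least one hard sample.
refined = [d for d in solved if utility(d, hard, toy_solve, rng) > 0]

answer = vote("6 * 7", refined, toy_solve, rng)
```

In a real setting `solve` would wrap a sampled LLM call, so `success_rate` would return fractional values rather than the 0.0 or 1.0 produced by this deterministic stand-in.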

Augmenting In-Context-Learning in LLMs via Automatic Data Labeling and Refinement

Are Human-generated Demonstrations Necessary for In-context Learning?

Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data

ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought

Conceptual In-Context Learning and Chain of Concepts: Solving Complex Conceptual Problems Using Large Language Models

Large Language Model-Aware In-Context Learning for Code Generation

Task-Level Thinking Steps Help Large Language Models for Challenging Classification Task

Effective Demonstration Annotation for In-Context Learning via Language Model-Based Determinantal Point Process

Improving In-Context Learning with Small Language Model Ensembles

Enhancing In-Context Learning via Implicit Demonstration Augmentation

Take One Step at a Time to Know Incremental Utility of Demonstration: An Analysis on Reranking for Few-Shot In-Context Learning

LLMs Are In-Context Reinforcement Learners

Let's Learn Step by Step: Enhancing In-Context Learning Ability with Curriculum Learning

AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations

Demonstration Augmentation for Zero-shot In-context Learning

TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools

Large Language Models Know What Makes Exemplary Contexts

Does In-Context Learning Really Learn? Rethinking How Large Language Models Respond and Solve Tasks via In-Context Learning

Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning