Abstract: Pretrained large language models (LLMs) are currently state-of-the-art for solving the vast majority of natural language processing tasks. While many real-world applications still require fine-tuning to reach satisfactory levels of performance, many of them are in the low-data regime, making fine-tuning challenging. To address this, we propose LLM2LLM, a targeted and iterative data augmentation strategy that uses a teacher LLM to enhance a small seed dataset by generating additional data that can be used for fine-tuning on a specific task. LLM2LLM (1) fine-tunes a baseline student LLM on the initial seed data, (2) evaluates and extracts data points that the model gets wrong, and (3) uses a teacher LLM to generate synthetic data based on these incorrect data points, which are then added back into the training data. This approach amplifies the signal from data points that the student LLM predicts incorrectly during training and reintegrates them into the dataset, focusing training on the examples the model finds most challenging. Our results show that LLM2LLM significantly enhances the performance of LLMs in the low-data regime, outperforming both traditional fine-tuning and other data augmentation baselines. LLM2LLM reduces the dependence on labor-intensive data curation and paves the way for more scalable and performant LLM solutions, allowing us to tackle data-constrained domains and tasks. We achieve improvements of up to 24.2% on the GSM8K dataset, 32.6% on CaseHOLD, 32.0% on SNIPS, 52.6% on TREC, and 39.8% on SST-2 over regular fine-tuning in the low-data regime using a Llama-2-7B student model. Our code is available at https://github.com/SqueezeAILab/LLM2LLM.
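The three steps in the abstract can be sketched as a short loop. This is a minimal illustration, not the authors' implementation: the names `llm2llm_loop`, `toy_fine_tune`, and `toy_teacher` are hypothetical, the student and teacher are toy stand-ins rather than real LLM calls, and two details are assumptions the abstract does not specify (the student is re-trained from the baseline each round, and errors are collected on the seed data only).

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]          # (question, answer)
Student = Callable[[str], str]     # maps a question to a predicted answer


def llm2llm_loop(
    seed: List[Example],
    fine_tune: Callable[[List[Example]], Student],
    teacher_augment: Callable[[List[Example]], List[Example]],
    n_rounds: int = 3,
) -> List[Example]:
    """Iteratively grow the training set from the student's mistakes."""
    train = list(seed)
    for _ in range(n_rounds):
        # (1) Fine-tune the student on the current training data.
        #     Assumption: each round starts again from the baseline model.
        student = fine_tune(train)
        # (2) Collect the examples the student gets wrong.
        #     Assumption: errors are measured on the seed data only.
        wrong = [(q, a) for q, a in seed if student(q) != a]
        if not wrong:
            break
        # (3) Ask the teacher for synthetic examples based on the errors
        #     and add them back into the training data.
        train.extend(teacher_augment(wrong))
    return train


# Toy stand-ins (hypothetical): the "student" only answers a question
# correctly once its topic (first word) appears at least twice in training.
def toy_fine_tune(train: List[Example]) -> Student:
    topic_counts = Counter(q.split()[0] for q, _ in train)
    answers = dict(train)

    def student(question: str) -> str:
        if topic_counts[question.split()[0]] >= 2:
            return answers.get(question, "")
        return ""

    return student


# The "teacher" just emits one labeled variant per wrong example.
def toy_teacher(wrong: List[Example]) -> List[Example]:
    return [(f"{q} (variant)", a) for q, a in wrong]


seed_data = [("add 2+3", "5"), ("add 4+4", "8"), ("mul 2*3", "6")]
augmented = llm2llm_loop(seed_data, toy_fine_tune, toy_teacher)
print(augmented)
```

In this toy run, only the under-represented `mul` example receives a synthetic variant, which mirrors the targeted nature of the augmentation: data is added where the student fails rather than uniformly across the seed set.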

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning

Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

SelectLLM: Can LLMs Select Important Instructions to Annotate?

Rethinking the Instruction Quality: LIFT is What You Need

Instruction Mining: Instruction Data Selection for Tuning Large Language Models

LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement

CoachLM: Automatic Instruction Revisions Improve the Data Quality in LLM Instruction Tuning

DELIA: Diversity-Enhanced Learning for Instruction Adaptation in Large Language Models

Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models

A Survey on Data Selection for LLM Instruction Tuning

Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning

CommonIT: Commonality-Aware Instruction Tuning for Large Language Models via Data Partitions

IterSelectTune: An Iterative Training Framework for Efficient Instruction-Tuning Data Selection

SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning

LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement

Maybe Only 0.5% Data is Needed: A Preliminary Exploration of Low Training Data Instruction Tuning

Balancing Cost and Effectiveness of Synthetic Data Generation Strategies for LLMs

Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

Building a Family of Data Augmentation Models for Low-cost LLM Fine-tuning on the Cloud