An Emulator for Fine-Tuning Large Language Models using Small Language Models

Eric Mitchell, Rafael Rafailov, Archit Sharma, Chelsea Finn, Christopher D. Manning

2023-10-20

Abstract: Widely used language models (LMs) are typically built by scaling up a two-stage training pipeline: a pre-training stage that uses a very large, diverse dataset of text and a fine-tuning (sometimes, 'alignment') stage that uses targeted examples or other specifications of desired behaviors. While it has been hypothesized that knowledge and skills come from pre-training, and fine-tuning mostly filters this knowledge and skillset, this intuition has not been extensively tested. To aid in doing so, we introduce a novel technique for decoupling the knowledge and skills gained in these two stages, enabling a direct answer to the question, "What would happen if we combined the knowledge learned by a large model during pre-training with the knowledge learned by a small model during fine-tuning (or vice versa)?" Using an RL-based framework derived from recent developments in learning from human preferences, we introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates (or 'emulates') the result of pre-training and fine-tuning at different scales. Our experiments with EFT show that scaling up fine-tuning tends to improve helpfulness, while scaling up pre-training tends to improve factuality. Beyond decoupling scale, we show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training. Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models, essentially emulating the result of fine-tuning the large pre-trained model. Up-scaling consistently improves helpfulness and factuality of instruction-following models in the Llama, Llama-2, and Falcon families, without additional hyperparameters or training.
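
The up-scaling idea in the abstract can be illustrated with a small sketch. It assumes the ensemble works per token in log-probability space: the large pre-trained model's next-token log-probabilities are shifted by the difference between a small fine-tuned model and its small pre-trained counterpart, then renormalized. The function names and the toy three-token vocabulary below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def upscaled_log_probs(large_base_logits, small_ft_logits, small_base_logits):
    """Sketch of up-scaling for one decoding step (assumed formulation).

    Adds the small models' fine-tuning "delta" (fine-tuned minus base
    log-probabilities) to the large base model's log-probabilities,
    then renormalizes so the result is a valid distribution.
    """
    large_base = log_softmax(large_base_logits)
    small_ft = log_softmax(small_ft_logits)
    small_base = log_softmax(small_base_logits)
    combined = [b + (f - s) for b, f, s in zip(large_base, small_ft, small_base)]
    return log_softmax(combined)


# Toy next-token logits over a hypothetical 3-token vocabulary.
large_base = [2.0, 1.0, 0.0]   # large pre-trained model
small_base = [1.0, 1.0, 0.0]   # small pre-trained model
small_ft = [0.0, 2.0, 0.0]     # small fine-tuned model

probs = [math.exp(lp) for lp in upscaled_log_probs(large_base, small_ft, small_base)]
```

In this toy example the small fine-tuned model shifts probability toward the second token relative to its own base model, and that shift carries over to the large model's distribution. If the small fine-tuned and small base models agree exactly, the delta is zero and the result reduces to the large pre-trained model's own distribution.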

Computation and Language, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper aims to address the following issues:

1. **Separating Knowledge Acquisition in Pre-training and Fine-tuning Stages**: The paper introduces a technique called "Emulated Fine-Tuning" (EFT) that decouples the knowledge and skills acquired during pre-training from those acquired during fine-tuning, so that the two stages can be combined at different model scales. This lets researchers study the effect of scaling up or down only one stage.
2. **Improving Fine-tuning Efficiency**: With EFT, it is possible to emulate the result of fine-tuning a large pre-trained model without actually performing that resource-intensive fine-tuning, by ensembling the large pre-trained model with a small fine-tuned model (LM up-scaling).
3. **Dynamically Adjusting Behavioral Characteristics**: EFT also allows competing behavioral traits, such as helpfulness and harmlessness, to be adjusted at test time without additional training.

In summary, the paper's core objective is to separate the effects of the pre-training and fine-tuning stages and to use emulated fine-tuning to improve the helpfulness and factuality of language models while avoiding the computational cost of fine-tuning large models.

Fine-tuning large language models for domain adaptation: Exploration of training strategies, scaling, model merging and synergistic capabilities

Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs

Scalable Fine-tuning from Multiple Data Sources: A First-Order Approximation Approach

Enhancing Large Language Model Performance To Answer Questions and Extract Information More Accurately

Large Language Models for Tuning Evolution Strategies

Selecting Large Language Model to Fine-tune via Rectified Scaling Law

CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model

CombLM: Adapting Black-Box Language Models through Small Fine-Tuned Models

A Framework for Fine-Tuning LLMs using Heterogeneous Feedback

Empirical Analysis of Efficient Fine-Tuning Methods for Large Pre-Trained Language Models

Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model

The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities

A Large-scale Empirical Study on Fine-tuning Large Language Models for Unit Testing

Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process

LIMA: Less Is More for Alignment

Revisiting the Superficial Alignment Hypothesis

Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models

Large Language Models As Evolution Strategies

Reflect-RL: Two-Player Online RL Fine-Tuning for LMs