Abstract: With instruction tuning, Large Language Models (LLMs) can enhance their ability to adhere to commands. Diverging from most works focusing on data mixing, our study concentrates on enhancing the model's capabilities from the perspective of data sampling during training. Drawing inspiration from the human learning process, where it is generally easier to master solutions to similar topics through focused practice on a single type of topic, we introduce a novel instruction tuning strategy termed CommonIT: Commonality-aware Instruction Tuning. Specifically, we cluster instruction datasets into distinct groups with three proposed metrics (Task, Embedding and Length). We ensure each training mini-batch, or "partition", consists solely of data from a single group, which brings about both data randomness across mini-batches and intra-batch data similarity. Rigorous testing demonstrates CommonIT's effectiveness in enhancing the instruction-following capabilities of LLMs across IT datasets (FLAN, CoT, and Alpaca) and models (LLaMa2-7B, Qwen2-7B, LLaMa 13B, and BLOOM 7B). CommonIT consistently yields an average improvement of 2.1% on the general domain (i.e., the average score of Knowledge, Reasoning, Multilinguality and Coding) with the Length metric, 5.2% on the special domain (i.e., GSM, Openfunctions and Code) with the Task metric, and 3.8% on specific tasks (i.e., MMLU) with the Embedding metric. Code is available at https://github.com/raojay7/CommonIT.
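The batching constraint described in the abstract (each mini-batch drawn from a single group, with randomness across mini-batches) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `commonality_batches`, the `group_of` callback, and the `task` and `text` fields are hypothetical, and the sketch assumes group labels are already available.

```python
import random
from collections import defaultdict


def commonality_batches(examples, group_of, batch_size, seed=0):
    """Build mini-batches whose members all share one group label.

    Hypothetical helper: `group_of` maps an example to its group
    (for instance a task name, a cluster id, or a length bucket).
    """
    rng = random.Random(seed)

    # Partition the dataset by group label.
    groups = defaultdict(list)
    for example in examples:
        groups[group_of(example)].append(example)

    # Cut each group into batches; the last batch of a group may be smaller.
    batches = []
    for members in groups.values():
        rng.shuffle(members)  # random order inside a group
        for start in range(0, len(members), batch_size):
            batches.append(members[start:start + batch_size])

    # Shuffle the batch order so consecutive steps see different groups.
    rng.shuffle(batches)
    return batches


# Toy usage with made-up data: three tasks, five examples each.
data = [
    {"task": task, "text": f"{task}-{i}"}
    for task in ("translate", "math", "code")
    for i in range(5)
]
batches = commonality_batches(data, lambda ex: ex["task"], batch_size=2)
```

Every batch in `batches` holds examples from one task only, while the order of batches is random. A real training loop would typically wrap this logic in the framework's batch sampler.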

What problem does this paper attempt to address?

### What problems does this paper attempt to solve?

This paper targets two problems that arise when large language models (LLMs) are instruction-tuned on mixed data: inaccurate instruction understanding and degraded task execution.

1. **Inaccurate instruction understanding**: Most existing methods perform instruction tuning on data mixed from different tasks, which can make it hard for the model to understand the instructions of a specific task and hurts its output (as shown in Figure 1). For example, the model may fail to recognize a translation instruction and simply reply with the final phrase.
2. **Decreased task-execution ability**: Because multiple tasks or diverse instructions are mixed together, the model's understanding of a specific task may drift, which lowers overall performance. The effect is more pronounced in multi-task settings, where the model performs well on some tasks but poorly on others.

To address these problems, the authors propose **CommonIT**, an instruction-tuning strategy based on data commonality. The method divides the instruction dataset into groups and ensures that each training mini-batch contains only data from a single group, with the aim of improving both the model's understanding of instructions and the accuracy of task execution.

### Main contributions of CommonIT

1. **Proposing the CommonIT framework**: The framework uses data commonality to enhance the model's instruction-following ability. It includes three grouping strategies (Task, Embedding, Length) and introduces a batch-based constraint in the optimization process (§3).
2. **Wide applicability**: CommonIT applies across several dimensions, including different datasets, general and specialized domains, and various models. The most appropriate grouping strategy for each scenario is also explored (§5.1).
3. **Analysis of improvement sources**: By exploring commonality, the paper offers possible explanations for where the improvements come from (§5.2 and §5.3).

According to the abstract, CommonIT achieves an average improvement of 2.1% on the general domain (with the Length metric), 5.2% on the special domain (with the Task metric), and 3.8% on specific tasks such as MMLU (with the Embedding metric).
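The three grouping strategies named above (task, embedding, length) could each be expressed as a function that maps an example to a group label. The sketch below is an assumption-laden illustration, not the paper's procedure: the field names, the word-count length measure, the bucket boundaries, and the nearest-centroid assignment (with centroids assumed to come from a prior clustering step) are all hypothetical choices.

```python
def task_group(example):
    """Group by the task label attached to the example (assumed field)."""
    return example["task"]


def length_group(example, boundaries=(64, 256, 1024)):
    """Group by length bucket.

    Assumption: length is the whitespace word count of `text`, and the
    bucket boundaries are illustrative values, not taken from the paper.
    """
    n_words = len(example["text"].split())
    for bucket, upper in enumerate(boundaries):
        if n_words <= upper:
            return bucket
    return len(boundaries)


def embedding_group(vector, centroids):
    """Group by nearest centroid (squared Euclidean distance).

    Assumption: `vector` is a precomputed embedding and `centroids`
    were produced beforehand, for example by k-means.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        range(len(centroids)),
        key=lambda k: squared_distance(vector, centroids[k]),
    )
```

Any of these functions can supply the group label used to build single-group mini-batches; which one works best depends on the scenario, as the summary notes for §5.1.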

CommonIT: Commonality-Aware Instruction Tuning for Large Language Models via Data Partitions

Instruction Tuning for Large Language Models: A Survey

Maybe Only 0.5 Training Data Instruction Tuning

Exploring the Relationship between In-Context Learning and Instruction Tuning

Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance

SelectIT: Selective Instruction Tuning for Large Language Models Via Uncertainty-Aware Self-Reflection

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

Demystifying Instruction Mixing for Fine-tuning Large Language Models

CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Instruction Tuning Vs. In-Context Learning: Revisiting Large Language Models in Few-Shot Computational Social Science

Boosting LLM via Learning from Data Iteratively and Selectively

CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with Really Good Data

Exploring Format Consistency for Instruction Tuning

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration

Mosaic-IT: Free Compositional Data Augmentation Improves Instruction Tuning

LLaMoCo: Instruction Tuning of Large Language Models for Optimization Code Generation

DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning

CITING: Large Language Models Create Curriculum for Instruction Tuning