TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution

Jiuding Yang, Shengyao Lu, Weidong Guo, Xiangyang Li, Kaitong Yang, Yu Xu, Di Niu
2024-09-18
Abstract: Large Language Models (LLMs) require precise alignment with complex instructions to optimize their performance in real-world applications. As the demand for refined instruction tuning data increases, traditional methods that evolve simple seed instructions often struggle to effectively enhance complexity or manage difficulty scaling across various domains. Our innovative approach, Task-Centered Instruction Evolution (TaCIE), addresses these shortcomings by redefining instruction evolution from merely evolving seed instructions to a more dynamic and comprehensive combination of elements. TaCIE starts by deconstructing complex instructions into their fundamental components. It then generates and integrates new elements with the original ones, reassembling them into more sophisticated instructions that progressively increase in difficulty, diversity, and complexity. Applied across multiple domains, LLMs fine-tuned with these evolved instructions have substantially outperformed those tuned with conventional methods, marking a significant advancement in instruction-based model fine-tuning.
Computers and Society, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the need to align large language models (LLMs) with complex human instructions in practical applications. Specifically, existing methods for generating more complex instructions to improve LLM performance have two main problems:

1. **Insufficient Difficulty Increment Management**: Existing methods such as EVOL-INSTRUCT often perform poorly when increasing task difficulty. The prompts provided are vague and lack specific guidance, which makes the results of instruction evolution difficult to control and predict. For example, attempts to add a constraint often fail, or merely replace terms without truly increasing the task difficulty.
2. **Inadequate Consideration of Cross-Domain Tasks**: Existing methods fail to handle the complexity of cross-domain tasks effectively. Although some methods such as Instruction Fusion can increase task complexity by fusing two different instructions, they are usually limited to tasks in a single domain and lack diversity.

To overcome these problems, the paper proposes the **Task-Centered Instruction Evolution (TaCIE)** method. TaCIE redefines the instruction evolution process in the following ways:

- **Instruction Decomposition**: Decompose complex instructions into three basic components (background information, goals, and constraints) so that each component can be modified precisely, which yields more significant instruction evolution.
- **Deep Evolution**: Gradually increase the difficulty of newly generated instructions by adding new constraints or background settings, keeping difficulty controllable and strengthening the logical reasoning the task demands.
- **Task Fusion**: Generate more complex and information-rich instructions by merging elements from different seed instructions, which is especially suitable for cross-domain tasks.

Through these methods, TaCIE addresses the shortcomings of existing methods in managing difficulty increments and handling cross-domain tasks. It also significantly improves LLM performance on multiple benchmarks, especially in instruction understanding, mathematics, and programming tasks.
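
The three operations described above (decomposition, deep evolution, and task fusion) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the paper's implementation: in TaCIE an LLM would presumably perform the decomposition and generate the new elements, whereas here the components are supplied by hand. The names `Instruction`, `deep_evolve`, and `fuse_tasks` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass
class Instruction:
    """An instruction split into background, goals, and constraints.

    Hypothetical container: the component names follow the summary,
    but the class itself is an assumption made for illustration.
    """
    background: str
    goals: List[str]
    constraints: List[str] = field(default_factory=list)

    def assemble(self) -> str:
        """Reassemble the components into a single instruction string."""
        parts = [self.background, "Goals: " + "; ".join(self.goals)]
        if self.constraints:
            parts.append("Constraints: " + "; ".join(self.constraints))
        return "\n".join(parts)


def deep_evolve(seed: Instruction,
                new_constraint: Optional[str] = None,
                new_background: Optional[str] = None) -> Instruction:
    """Return a harder copy of `seed` with an added constraint and/or background.

    The new elements are passed in by the caller; generating them
    (e.g. with an LLM) is outside the scope of this sketch.
    """
    background = seed.background
    if new_background:
        background = f"{background} {new_background}"
    constraints = list(seed.constraints)  # copy so the seed is not mutated
    if new_constraint and new_constraint not in constraints:
        constraints.append(new_constraint)
    return replace(seed, background=background, constraints=constraints)


def fuse_tasks(a: Instruction, b: Instruction) -> Instruction:
    """Merge the elements of two seed instructions into one instruction."""
    extra = [c for c in b.constraints if c not in a.constraints]
    return Instruction(
        background=f"{a.background} {b.background}",
        goals=a.goals + b.goals,
        constraints=a.constraints + extra,
    )


# Two hand-written seeds from different domains (math and programming).
math_seed = Instruction("You are given a list of integers.",
                        ["Compute the sum"], ["Use no loops"])
code_seed = Instruction("You are writing a Python function.",
                        ["Return the result as JSON"])

harder = deep_evolve(math_seed, new_constraint="Run in O(n) time")
fused = fuse_tasks(harder, code_seed)
print(fused.assemble())
```

Keeping the three components as separate fields is what makes each evolution step targeted: a constraint can be added without touching the goals, and two seeds can be merged field by field instead of by concatenating raw prompt text.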