CITING: Large Language Models Create Curriculum for Instruction Tuning

Tao Feng, Zifeng Wang, Jimeng Sun
2023-10-04
Abstract: The recent advancement of large language models (LLMs) has been achieved through a combination of instruction tuning and human alignment. However, building manually crafted instruction datasets and performing human alignment have become the bottleneck for scaling the development of LLMs. In this paper, we exploit the idea of leveraging AI models in lieu of humans as the teacher to train student LLMs. Our method is inspired by how human students refine their writing skills by following the rubrics and learning from the revisions offered by their tutors. Specifically, we employ a teacher LLM to create a curriculum for instruction tuning of the student LLM, namely Curriculum Instruction TunING (CITING). It encompasses two main steps: (1) the teacher LLM crafts the rubrics for evaluating the answers corresponding to various types of questions, and (2) the student LLM learns to follow the rubrics and perform self-correction from the revision made by the teacher. We carry out these two steps iteratively to form the CITING procedure. We compare CITING to a series of state-of-the-art baselines on four datasets. Under GPT-4 evaluation, our method demonstrates strong improvement on three criteria: articulate, in-depth, and comprehensive. Specifically, it achieves an average winning rate of 79.4% over SFT, 73.4% over RLHF, 78.1% over RRHF, and 76.3% over RAFT.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses bottlenecks in the instruction tuning and human alignment of large language models (LLMs). Building high-quality, manually crafted instruction datasets and performing human alignment are costly and time-consuming, and this has become the main obstacle to scaling LLM development. To address this, the paper proposes Curriculum Instruction TunING (CITING), which uses an advanced teacher LLM to generate a curriculum that guides the learning of a student LLM, reducing the dependence on manual annotation and improving model performance.

### Main Contributions

1. **Curriculum Design and Rubric Setting**: The teacher LLM formulates evaluation criteria (rubrics) for different types of questions. These criteria are used to evaluate the quality of the student LLM's answers and also serve as additional guidance that helps the student correct wrong answers.
2. **Learning and Revision**: Starting from the student LLM's initial responses, the teacher LLM provides personalized revisions. By comparing its answers before and after revision, the student LLM learns to improve its responses through self-correction. This process can be iterated to further enhance the student's performance.

### Experimental Results

The paper reports experiments on four datasets: Alpaca, World Knowledge, Reading Comprehension, and Commonsense Reasoning. The results show that CITING significantly outperforms existing baseline methods on all metrics, especially in zero-shot tasks. The metrics are:

- **Articulate (Clarity)**: the structure, language quality, and overall readability of a response.
- **In-depth (Depth)**: how deeply and in how much detail a response covers the topic or question.
- **Comprehensive (Comprehensiveness)**: the breadth of a response, that is, whether it covers multiple angles and relevant aspects.

### Conclusion

By using a teacher LLM to generate the curriculum and the revisions, CITING reduces the dependence on manual annotation and significantly improves the performance of the student LLM. The method performs well on multiple datasets, especially on commonsense reasoning tasks, showing strong generalization and reasoning abilities.
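
The two-step procedure summarized above (the teacher writes rubrics, then the student learns from the teacher's revisions, repeated over several rounds) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`build_rubrics`, `collect_revisions`, `run_citing`, `RevisionExample`, and the `toy_*` stand-ins) is hypothetical, the teacher and student are plain functions rather than LLMs, and "fine-tuning" is replaced by simple memorization so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical; they are not taken from the paper's code.
Student = Callable[[str, str], str]  # (instruction, rubric) -> response


@dataclass(frozen=True)
class RevisionExample:
    instruction: str
    rubric: str
    initial_response: str  # student's answer before revision
    revised_response: str  # teacher's revision of that answer


def build_rubrics(question_types: List[str],
                  teacher_rubric: Callable[[str], str]) -> Dict[str, str]:
    """Step 1: the teacher writes one rubric per question type."""
    return {qtype: teacher_rubric(qtype) for qtype in question_types}


def collect_revisions(tasks: List[Tuple[str, str]],
                      rubrics: Dict[str, str],
                      student: Student,
                      teacher_revise: Callable[[str, str, str], str],
                      ) -> List[RevisionExample]:
    """Step 2a: the student answers, then the teacher revises each answer."""
    examples = []
    for qtype, instruction in tasks:
        rubric = rubrics[qtype]
        initial = student(instruction, rubric)
        revised = teacher_revise(instruction, rubric, initial)
        examples.append(RevisionExample(instruction, rubric, initial, revised))
    return examples


def run_citing(tasks, teacher_rubric, teacher_revise, student, fine_tune,
               num_rounds: int = 2) -> Student:
    """Iterate: gather revisions, then update the student on them."""
    rubrics = build_rubrics(sorted({qtype for qtype, _ in tasks}), teacher_rubric)
    for _ in range(num_rounds):
        examples = collect_revisions(tasks, rubrics, student, teacher_revise)
        student = fine_tune(student, examples)  # Step 2b: learn from revisions
    return student


# Toy stand-ins (assumptions) so the sketch runs without any model.
def toy_teacher_rubric(qtype: str) -> str:
    return f"A good {qtype} answer is clear, detailed, and complete."


def toy_teacher_revise(instruction: str, rubric: str, initial: str) -> str:
    return initial if initial.endswith("(revised)") else f"{initial} (revised)"


def toy_student(instruction: str, rubric: str) -> str:
    return f"Draft answer to: {instruction}"


def toy_fine_tune(student: Student, examples: List[RevisionExample]) -> Student:
    # "Training" here just memorizes the teacher's revisions.
    memory = {ex.instruction: ex.revised_response for ex in examples}

    def updated(instruction: str, rubric: str) -> str:
        return memory.get(instruction, student(instruction, rubric))

    return updated


tasks = [("reasoning", "Why does ice float?"),
         ("reading", "Summarize the passage.")]
trained = run_citing(tasks, toy_teacher_rubric, toy_teacher_revise,
                     toy_student, toy_fine_tune)
print(trained("Why does ice float?", ""))
```

In a real setting, `teacher_rubric` and `teacher_revise` would presumably wrap calls to a stronger LLM, and `fine_tune` would run supervised training on the (initial, revised) pairs; the loop structure is the only part this sketch is meant to convey.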