Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation

Xunyu Zhu, Jian Li, Can Ma, Weiping Wang
2024-11-22
Abstract: Large Language Models (LLMs) demonstrate exceptional reasoning capabilities, often achieving state-of-the-art performance in various tasks. However, their substantial computational and memory demands, due to billions of parameters, hinder deployment in resource-constrained environments. A promising solution is knowledge distillation, where LLMs transfer reasoning capabilities to Small Language Models (SLMs, $\le$ 1B parameters), enabling wider deployment on low-resource devices. Existing methods primarily focus on generating high-quality reasoning rationales for distillation datasets but often neglect the critical role of data quantity and quality. To address these challenges, we propose a Feedback-Driven Distillation (FDD) framework to enhance SLMs' mathematical reasoning capabilities. In the initialization stage, a distillation dataset is constructed by prompting LLMs to pair mathematical problems with corresponding reasoning rationales. We classify problems into easy and hard categories based on SLM performance. For easy problems, LLMs generate more complex variations, while for hard problems, new questions of similar complexity are synthesized. In addition, we propose a multi-round distillation paradigm to iteratively enrich the distillation datasets, thereby progressively improving the mathematical reasoning abilities of SLMs. Experimental results demonstrate that our method can make SLMs achieve SOTA mathematical reasoning performance.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper asks how to enhance the mathematical reasoning ability of small language models (SLMs) in resource-constrained environments through a Feedback-Driven Distillation (FDD) framework. Specifically, it focuses on how to generate high-quality datasets effectively so that SLMs reach or approach the mathematical reasoning performance of large language models (LLMs) while keeping computational and memory requirements low.

### Background of the Paper

- **Large Language Models (LLMs)**: Although LLMs perform well on a variety of reasoning tasks, their large number of parameters (ranging from billions to tens of billions) results in high computational costs and memory requirements, which limits their deployment in resource-constrained environments.
- **Knowledge Distillation**: A method of transferring the knowledge of LLMs to SLMs, so that SLMs can run on low-resource devices while maintaining strong reasoning performance.
- **Limitations of Existing Methods**: Most existing methods focus on generating high-quality reasoning rationales for distillation datasets, but overlook the impact of data quantity and data quality on mathematical reasoning ability.

### Solution

The paper proposes the FDD framework, which aims to enhance the mathematical reasoning ability of SLMs through the following steps:

1. **Initialization Phase**: Use LLMs to construct an initial mathematical distillation dataset in which each problem is paired with a corresponding Program-of-Thought (PoT) reasoning rationale. This dataset is used for the initial training of the SLMs.
2. **Question Generation Phase**: Based on the performance of the SLMs on the initial dataset, divide the questions into two categories: easy and hard. For easy questions, LLMs generate more complex variations; for hard questions, LLMs generate new questions of similar difficulty. The newly generated questions are added to the distillation dataset, increasing its complexity and diversity.
3. **Fine-Tuning Phase**: Use the expanded distillation dataset to fine-tune SLMs from scratch, further enhancing their mathematical reasoning ability.
4. **Multi-Round Distillation Paradigm**: Repeat the process over multiple rounds, enriching the distillation dataset each time and progressively improving the mathematical reasoning ability of the SLMs.

### Experimental Results

- **Main Experimental Results**: The paper reports experiments on multiple mathematical reasoning datasets. The results show that the FDD framework significantly improves the mathematical reasoning performance of SLMs, which even match or exceed some open-source large language models (such as Llama-2 and WizardMath).
- **Transferability**: The FDD framework improves not only the in-domain mathematical reasoning ability of SLMs but also their out-of-domain mathematical reasoning ability.
- **Impact of Model Size**: The experimental results indicate that SLM size has a significant impact on mathematical reasoning performance, and larger models usually perform better.

### Conclusion

The FDD framework proposed in the paper enhances the mathematical reasoning ability of SLMs by generating high-quality and diverse datasets, enabling them to handle mathematical reasoning tasks in resource-constrained environments. The method performs well in-domain, transfers to out-of-domain data, and is suitable for multiple mathematical reasoning tasks.
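
The question generation phase described under Solution can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the authors' implementation: `Problem`, `split_by_feedback`, and `expand_dataset` are hypothetical names, and the callables `solve_with_slm`, `harder_variant`, and `similar_variant` stand in for the SLM and the LLM prompts. The rule that a question counts as easy when the SLM answers it correctly is also an assumption, since the summary only states that the split is based on SLM performance.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Problem:
    question: str
    answer: float  # reference answer stored in the distillation dataset


def split_by_feedback(
    problems: List[Problem],
    solve_with_slm: Callable[[str], float],  # hypothetical SLM wrapper
    tol: float = 1e-6,
) -> Tuple[List[Problem], List[Problem]]:
    """Split problems into (easy, hard) using the SLM's own answers.

    Assumption: a problem is easy if the SLM's answer matches the
    reference within `tol`, and hard if it is wrong or the SLM fails.
    """
    easy: List[Problem] = []
    hard: List[Problem] = []
    for problem in problems:
        try:
            predicted = solve_with_slm(problem.question)
            correct = abs(predicted - problem.answer) <= tol
        except Exception:
            # A crashing or unparsable rationale counts as a wrong answer.
            correct = False
        (easy if correct else hard).append(problem)
    return easy, hard


def expand_dataset(
    problems: List[Problem],
    solve_with_slm: Callable[[str], float],
    harder_variant: Callable[[Problem], Problem],   # hypothetical LLM prompt
    similar_variant: Callable[[Problem], Problem],  # hypothetical LLM prompt
) -> List[Problem]:
    """Return the original problems plus one new question per problem.

    Easy problems get a more complex variation; hard problems get a new
    question of similar difficulty.
    """
    easy, hard = split_by_feedback(problems, solve_with_slm)
    generated = [harder_variant(p) for p in easy]
    generated += [similar_variant(p) for p in hard]
    return list(problems) + generated
```

In this sketch, one distillation round would fine-tune the SLM on the current dataset, wrap the resulting model as `solve_with_slm`, and call `expand_dataset` to obtain the dataset for the next round. Whether later rounds draw on the whole dataset or only on the newly generated questions is not specified in the summary, so that choice is left to the caller.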