Distilling LLMs' Decomposition Abilities into Compact Language Models

Denis Tarasov, Kumar Shridhar

2024-02-02

Abstract: Large Language Models (LLMs) have demonstrated proficiency in their reasoning abilities, yet their large size presents scalability challenges and limits any further customization. In contrast, compact models offer customized training but often fall short in solving complex reasoning tasks. This study focuses on distilling the LLMs' decomposition skills into compact models using offline reinforcement learning. We leverage the advancements in the LLMs' capabilities to provide feedback and generate a specialized task-specific dataset for training compact models. The development of an AI-generated dataset and the establishment of baselines constitute the primary contributions of our work, underscoring the potential of compact models in replicating complex problem-solving skills.

Computation and Language, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper asks how the complex problem-solving abilities of large language models (LLMs) can be distilled into smaller models. Specifically, it focuses on the following points:

1. **The tension between model size and reasoning ability**: Although large language models exhibit strong reasoning capabilities, their size limits further customization and demands substantial computational resources.
2. **Limitations of small models**: While small models are easier to customize and train, they perform poorly on complex reasoning tasks.
3. **Insufficient research on problem decomposition ability**: Despite extensive research on end-to-end reasoning methods, the ability to decompose complex problems into simple sub-problems has not been fully explored.

To address these issues, the authors propose distilling the problem decomposition skills of large language models into compact models using offline reinforcement learning. The main contributions include:

- Creating an AI-generated dataset specifically for mathematical reasoning tasks (GSM8K-AI-SubQ) to train small models.
- Establishing various baseline models to compare the effectiveness of different methods, demonstrating the potential of small models in replicating complex problem-solving skills.

In summary, the paper attempts to improve the performance of small models on tasks such as mathematical reasoning through offline reinforcement learning and AI-generated datasets, and suggests directions for future research.
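To make the idea of training on logged decompositions with feedback more concrete, here is a minimal sketch of how one such sample might be represented and turned into offline training data. The class names, the per-sub-question `reward` field, the state encoding, and the filtering threshold are all illustrative assumptions; they are not the actual schema of GSM8K-AI-SubQ or the authors' algorithm.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SubQuestion:
    text: str      # one sub-question produced by the teacher LLM
    reward: float  # hypothetical feedback score for this sub-question, in [0, 1]


@dataclass
class DecompositionSample:
    problem: str                      # the original math word problem
    sub_questions: List[SubQuestion]  # the logged decomposition, in order


def to_transitions(sample: DecompositionSample) -> List[Tuple[str, str, float, bool]]:
    """Turn one logged decomposition into (state, action, reward, done) tuples.

    Assumed encoding: the state is the problem plus all sub-questions emitted
    so far, and the action is the next sub-question.
    """
    transitions = []
    state = sample.problem
    last = len(sample.sub_questions) - 1
    for i, sq in enumerate(sample.sub_questions):
        transitions.append((state, sq.text, sq.reward, i == last))
        state = state + "\n" + sq.text
    return transitions


def filter_for_cloning(samples: List[DecompositionSample],
                       min_reward: float = 0.5) -> List[DecompositionSample]:
    """Keep non-empty samples whose every sub-question meets the threshold.

    This is a simple filtered behavior-cloning style baseline, used here only
    as an example of how feedback could shape the training set.
    """
    return [
        s for s in samples
        if s.sub_questions and all(q.reward >= min_reward for q in s.sub_questions)
    ]
```

A compact model could then be fine-tuned on the filtered samples, or the transitions could be passed to an offline reinforcement learning method of choice; which variant works best is an empirical question that the paper's baselines are meant to address.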

Concise and Organized Perception Facilitates Large Language Models for Deductive Reasoning.

Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning

Divide-or-Conquer? Which Part Should You Distill Your LLM?

$\texttt{LM}^\texttt{2}$: A Simple Society of Language Models Solves Complex Reasoning

Effective Distillation of Table-based Reasoning Ability from LLMs

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation

Mixed Distillation Helps Smaller Language Model Better Reasoning

Distilling Mathematical Reasoning Capabilities into Small Language Models

Disentangling Memory and Reasoning Ability in Large Language Models

Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model

Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability

CLR-Bench: Evaluating Large Language Models in College-level Reasoning

Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks

Coupling Large Language Models with Logic Programming for Robust and General Reasoning from Text

LLMs for Relational Reasoning: How Far are We?

Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data

Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning