Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning

Gurusha Juneja, Subhabrata Dutta, Soumen Chakrabarti, Sunny Manchanda, Tanmoy Chakraborty
2024-02-27
Abstract: Large Language Models (LLMs) prompted to generate chain-of-thought (CoT) exhibit impressive reasoning capabilities. Recent attempts at prompt decomposition toward solving complex, multi-step reasoning problems depend on the ability of the LLM to simultaneously decompose and solve the problem. A significant disadvantage is that foundational LLMs are typically not available for fine-tuning, making adaptation computationally prohibitive. We believe (and demonstrate) that problem decomposition and solution generation are distinct capabilities, better addressed in separate modules than by one monolithic LLM. We introduce DaSLaM, which uses a decomposition generator to decompose complex problems into subproblems that require fewer reasoning steps. These subproblems are answered by a solver. We use a relatively small (13B parameters) LM as the decomposition generator, which we train using policy gradient optimization to interact with a solver LM (regarded as black-box) and guide it through subproblems, thereby rendering our method solver-agnostic. Evaluation on multiple reasoning datasets reveals that with our method, a 175 billion parameter LM (text-davinci-003) can produce competitive or even better performance, compared to its orders-of-magnitude larger successor, GPT-4. Additionally, we show that DaSLaM is not limited by the solver's capabilities as a function of scale; e.g., solver LMs with diverse sizes give significant performance improvement with our solver-agnostic decomposition technique. Exhaustive ablation studies evince the superiority of our modular finetuning technique over exorbitantly large decomposer LLMs, based on prompting alone.
Computation and Language, Artificial Intelligence
### What problem does this paper attempt to solve?

This paper addresses the challenges that large language models (LLMs) face on complex, multi-step reasoning problems. Specifically:

1. **Separation of Decomposition and Solving**:
   - Existing prompt-decomposition methods rely on a single LLM to both decompose and solve the problem. Such foundational models are large and typically not available for fine-tuning, which makes adaptation computationally prohibitive.
   - The paper proposes separating the two functions: a smaller model is dedicated to problem decomposition, and another model is responsible for solving.

2. **Modular Fine-Tuning**:
   - The proposed method, DaSLaM, uses a decomposition generator (a relatively small 13-billion-parameter model) trained with policy gradient optimization to guide the solver, which is treated as a black box.
   - The paper reports that this modular fine-tuning is more effective than prompting much larger LLMs to act as decomposers.

3. **Performance Improvement**:
   - Experiments on multiple reasoning datasets show that DaSLaM improves the performance of an existing solver, GPT-3.5 (text-davinci-003), to a level competitive with its larger successor, GPT-4.
   - On some tasks, the DaSLaM-enhanced GPT-3.5 model surpasses GPT-4.

4. **Flexibility and Robustness**:
   - Because the solver is treated as a black box, DaSLaM is solver-agnostic: solver LMs of different sizes show significant performance improvements, so the gains do not depend on the solver's scale.
   - The improvements also hold up on challenging datasets.

Taken together, the paper shows that modular, specialized components can improve performance on complex reasoning tasks.
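
The division of labor described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `solve_with_decomposition`, the `Decomposer` and `Solver` interfaces, the prompt layout, and the toy stand-in models are all assumptions made for illustration. The only properties taken from the summary are that a decomposer produces subproblems, a solver answers them, and the solver is only ever called as a black box.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions): both models are text-in, text-out.
Decomposer = Callable[[str], List[str]]  # question -> list of subquestions
Solver = Callable[[str], str]            # prompt -> answer


def solve_with_decomposition(
    question: str, decomposer: Decomposer, solver: Solver
) -> Tuple[str, List[Tuple[str, str]]]:
    """Decompose the question, answer each subproblem, then answer the original."""
    sub_qa: List[Tuple[str, str]] = []
    for sub_q in decomposer(question):
        # The solver is only called, never inspected or updated (black box).
        sub_qa.append((sub_q, solver(sub_q)))

    # Assumed prompt layout: subquestion/answer pairs precede the original question.
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in sub_qa)
    final_prompt = f"{context}\nQ: {question}\nA:" if context else f"Q: {question}\nA:"
    return solver(final_prompt), sub_qa


# Toy stand-ins so the sketch runs without any model; real use would wrap LLM calls.
def toy_decomposer(question: str) -> List[str]:
    return ["What is 3 * 4?", "What is 12 + 5?"]


def toy_solver(prompt: str) -> str:
    known = {"What is 3 * 4?": "12", "What is 12 + 5?": "17"}
    # For the final prompt, fall back to the last computed value in this toy setup.
    return known.get(prompt, "17")


if __name__ == "__main__":
    answer, trace = solve_with_decomposition(
        "What is 3 * 4 + 5?", toy_decomposer, toy_solver
    )
    print(answer)
    print(trace)
```

Because the solver appears only as a callable, swapping in a different solver model requires no change to the decomposer side, which is the sense in which the summary calls the approach solver-agnostic. Training the decomposer with policy gradient is outside the scope of this sketch.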