HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows

Wenlin Yao, Haitao Mi, Dong Yu
2024-09-26
Abstract: Despite recent advancements in large language models (LLMs), their performance on complex reasoning problems requiring multi-step thinking and combining various skills is still limited. To address this, we propose a novel framework HDFlow for complex reasoning with LLMs that combines fast and slow thinking modes in an adaptive manner. Our approach consists of two key components: 1) a new approach for slow, deliberate reasoning called Dynamic Workflow, which automatically decomposes complex problems into more manageable sub-tasks and dynamically designs a workflow to assemble specialized LLM or symbolic reasoning tools to solve sub-tasks; 2) Hybrid Thinking, a general framework that dynamically combines fast and slow thinking based on problem complexity. Finally, we propose an easy-to-scale method for automatically synthesizing a large-scale dataset of 27K challenging reasoning problems for complex reasoning and a hybrid thinking tuning method that trains smaller LLMs on this dataset to internalize the fast/slow hybrid reasoning strategies. Experiments on four reasoning benchmark datasets demonstrate that our slow thinking with dynamic workflows significantly outperforms Chain-of-Thought, and hybrid thinking achieves the highest accuracy while providing an effective balance between computational efficiency and performance. Fine-tuning using our hybrid thinking approach also significantly boosts the complex reasoning capabilities of open-source language models. The results showcase the promise of slow thinking, dynamic workflows, and hybrid thinking in expanding the frontier of complex problem-solving with LLMs. Code and data will be released at https://github.com/wenlinyao/HDFlow.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the limited ability of large language models (LLMs) to solve complex reasoning problems that require multi-step thinking and the integration of several skills. It identifies three limitations of existing methods:

1. **Integration of Cross-Domain Knowledge and Tools**: Complex problems often require combining multiple knowledge domains, skills, and tools. Methods such as AlphaCodium and AlphaGeometry have shown the potential of combining language models with symbolic reasoning, but they rely on manually designed, domain-specific workflows and therefore lack generality and adaptability.
2. **Single Mode of Thinking**: Traditional methods usually rely on a single mode of thinking, such as a fixed Chain-of-Thought (CoT) prompting strategy, which may perform poorly on complex tasks that require detailed analysis.
3. **Performance Degradation with Increasing Problem Complexity**: The performance of existing methods declines significantly as problems become more complex, which calls for a framework that can scale to the most complex reasoning problems.

To address these issues, the paper proposes a framework called HDFlow, which combines fast thinking (System I) with more analytical slow thinking (System II). It has two key components:

1. **Dynamic Workflow**: A new method for slow, deliberate reasoning that automatically decomposes a complex problem into more manageable sub-tasks and dynamically designs a workflow that assembles specialized LLMs or symbolic reasoning tools to solve them.
2. **Hybrid Thinking**: A general framework that dynamically combines fast and slow thinking based on problem complexity. Fast thinking is used by default for simple tasks; when the model has low confidence in the fast-thinking output, it automatically switches to slow thinking, which makes problem-solving both more efficient and more accurate.

Additionally, the paper proposes a scalable method for automatically synthesizing a large-scale dataset of 27K challenging reasoning problems, and a hybrid thinking fine-tuning method that trains open-source LLMs on this dataset to internalize the fast/slow hybrid reasoning strategy.

Experimental results on four reasoning benchmark datasets show that slow thinking with dynamic workflows significantly outperforms CoT, while hybrid thinking achieves the highest accuracy on three of the datasets and effectively balances computational efficiency and performance.
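The fast/slow routing and the sub-task workflow described above can be illustrated with a small Python sketch. This is not the authors' implementation: the names `SubTask`, `run_workflow`, `hybrid_solve`, `toy_fast`, and `toy_slow` are hypothetical, the numeric confidence score and the `threshold` value are assumptions made for illustration, and plain Python functions stand in for LLM calls and symbolic tools.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class SubTask:
    """One step of a workflow (hypothetical structure, not from the paper)."""
    name: str
    depends_on: List[str]
    solve: Callable[[Dict[str, str]], str]  # stand-in for an LLM or symbolic tool


def run_workflow(subtasks: List[SubTask]) -> Dict[str, str]:
    """Run sub-tasks in the listed order, feeding earlier results to later ones."""
    results: Dict[str, str] = {}
    for task in subtasks:
        missing = [d for d in task.depends_on if d not in results]
        if missing:
            raise ValueError(f"{task.name} is missing inputs: {missing}")
        inputs = {d: results[d] for d in task.depends_on}
        results[task.name] = task.solve(inputs)
    return results


def hybrid_solve(
    problem: str,
    fast: Callable[[str], Tuple[str, float]],
    slow: Callable[[str], str],
    threshold: float = 0.8,  # assumed cutoff, chosen only for this example
) -> Tuple[str, str]:
    """Try fast thinking first; fall back to slow thinking on low confidence."""
    answer, confidence = fast(problem)
    if confidence >= threshold:
        return answer, "fast"
    return slow(problem), "slow"


def toy_fast(problem: str) -> Tuple[str, float]:
    # Toy stand-in: confident only on one memorized problem.
    if problem == "2 + 2":
        return "4", 0.95
    return "unsure", 0.2


def toy_slow(problem: str) -> str:
    # Toy stand-in for a dynamic workflow: split the sum, then add the parts.
    subtasks = [
        SubTask("parse", [], lambda _inputs: problem),
        SubTask(
            "compute",
            ["parse"],
            lambda inputs: str(sum(int(t) for t in inputs["parse"].split("+"))),
        ),
    ]
    return run_workflow(subtasks)["compute"]


if __name__ == "__main__":
    print(hybrid_solve("2 + 2", toy_fast, toy_slow))
    print(hybrid_solve("12 + 30 + 5", toy_fast, toy_slow))
```

In this sketch the slow path runs only when the fast answer falls below the threshold, which is one simple way to spend extra computation on harder problems only. A real system would need to obtain the confidence signal and the sub-task decomposition from the model itself, which this toy does not attempt.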