Abstract: Despite recent advancements in large language models (LLMs), their performance on complex reasoning problems requiring multi-step thinking and combining various skills is still limited. To address this, we propose a novel framework HDFlow for complex reasoning with LLMs that combines fast and slow thinking modes in an adaptive manner. Our approach consists of two key components: 1) a new approach for slow, deliberate reasoning called Dynamic Workflow, which automatically decomposes complex problems into more manageable sub-tasks and dynamically designs a workflow to assemble specialized LLM or symbolic reasoning tools to solve sub-tasks; 2) Hybrid Thinking, a general framework that dynamically combines fast and slow thinking based on problem complexity. Finally, we propose an easy-to-scale method for automatically synthesizing a large-scale dataset of 27K challenging reasoning problems for complex reasoning and a hybrid thinking tuning method that trains smaller LLMs on this dataset to internalize the fast/slow hybrid reasoning strategies. Experiments on four reasoning benchmark datasets demonstrate that our slow thinking with dynamic workflows significantly outperforms Chain-of-Thought, and hybrid thinking achieves the highest accuracy while providing an effective balance between computational efficiency and performance. Fine-tuning using our hybrid thinking approach also significantly boosts the complex reasoning capabilities of open-source language models. The results showcase the promise of slow thinking, dynamic workflows, and hybrid thinking in expanding the frontier of complex problem-solving with LLMs\footnote{Code and data will be released at \url{https://github.com/wenlinyao/HDFlow}.}.

What problem does this paper attempt to address?

The paper addresses the limitations of large language models (LLMs) on complex reasoning problems that require multi-step thinking and the integration of multiple skills. Specifically, existing methods have the following limitations when solving complex problems:

1. **Integration of Cross-Domain Knowledge and Tools**: Complex problems often require combining multiple knowledge domains, skills, and tools. Existing methods such as AlphaCodium and AlphaGeometry have shown the potential of combining language models with symbolic reasoning, but they rely on manually designed, domain-specific workflows and therefore lack generality and adaptability.
2. **Single Mode of Thinking**: Traditional methods usually rely on a single mode of thinking, such as a fixed Chain-of-Thought (CoT) prompting strategy, which may perform poorly on complex tasks that require detailed analysis.
3. **Performance Degradation with Increasing Problem Complexity**: As problem complexity increases, the performance of existing methods declines significantly, which calls for a framework that can scale to the most complex reasoning problems.

To address these issues, the paper proposes a new framework called HDFlow, which combines fast thinking (System I) with more analytical slow thinking (System II). It has two key components:

1. **Dynamic Workflow**: A new slow, deliberative reasoning method that automatically decomposes complex problems into more manageable sub-tasks and dynamically designs workflows that assemble specialized LLMs or symbolic reasoning tools to solve these sub-tasks.
2. **Hybrid Thinking**: A general framework that dynamically combines fast and slow thinking based on the complexity of the problem. The fast thinking mode is used by default, which suits simple tasks; when the model has low confidence in the output of fast thinking, it automatically switches to the slow thinking mode, achieving more efficient and accurate problem-solving.

Additionally, the paper proposes a scalable method for automatically generating a large-scale dataset of 27K challenging reasoning problems and introduces a hybrid thinking fine-tuning method that trains open-source LLMs on this dataset to internalize the fast/slow hybrid reasoning strategy. Experimental results show that slow thinking with dynamic workflows significantly outperforms CoT on four reasoning benchmark datasets, while hybrid thinking achieves the highest accuracy on three of the four datasets, effectively balancing computational efficiency and performance.
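
The routing idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `hybrid_solve`, `Answer`, and the toy solvers, the numeric confidence score, and the `0.8` threshold are all assumptions made for illustration. The summary only states that the model falls back to slow thinking when confidence in the fast answer is low, and that slow thinking decomposes the problem into sub-tasks solved by specialized LLMs or symbolic tools.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Answer:
    text: str
    mode: str  # "fast" or "slow"


# Hypothetical interfaces; a real system would wrap LLM or tool calls here.
FastSolver = Callable[[str], Tuple[str, float]]   # problem -> (answer, confidence)
Decomposer = Callable[[str], List[str]]           # problem -> ordered sub-tasks
SubSolver = Callable[[str, List[str]], str]       # (sub-task, earlier results) -> result


def hybrid_solve(
    problem: str,
    fast_solver: FastSolver,
    decompose: Decomposer,
    solve_subtask: SubSolver,
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
) -> Answer:
    """Try fast thinking first; fall back to a decomposed workflow if unsure."""
    text, confidence = fast_solver(problem)
    if confidence >= threshold:
        return Answer(text, "fast")

    # Slow path: solve sub-tasks in order, passing earlier results forward.
    results: List[str] = []
    for subtask in decompose(problem):
        results.append(solve_subtask(subtask, results))

    # Treat the last sub-task's result as the final answer; keep the fast
    # answer only if decomposition produced no sub-tasks.
    return Answer(results[-1] if results else text, "slow")


# Toy stand-ins so the sketch runs without any model.
def toy_fast(problem: str) -> Tuple[str, float]:
    # Pretend that only very short problems can be answered confidently.
    confidence = 0.9 if len(problem.split()) <= 3 else 0.2
    return "fast:" + problem, confidence


def toy_decompose(problem: str) -> List[str]:
    return [part.strip() for part in problem.split(" then ")]


def toy_sub(subtask: str, previous: List[str]) -> str:
    return f"done({subtask})"


if __name__ == "__main__":
    print(hybrid_solve("2 + 2", toy_fast, toy_decompose, toy_sub))
    print(hybrid_solve("parse input then compute total then report",
                       toy_fast, toy_decompose, toy_sub))
```

In this sketch the expensive decomposition step runs only when the fast path is not trusted, which is the efficiency and accuracy trade-off the summary attributes to hybrid thinking.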

HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows

Concise and Organized Perception Facilitates Large Language Models for Deductive Reasoning

DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models

Adaptive-Solver Framework for Dynamic Strategy Selection in Large Language Model Reasoning

Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples

Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking

Synergy-of-Thoughts: Eliciting Efficient Reasoning in Hybrid Language Models

Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning

StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows

Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Concise and Organized Perception Facilitates Reasoning in Large Language Models

Eliminating Reasoning via Inferring with Planning: A New Framework to Guide LLMs' Non-linear Thinking

Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning

Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning

CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge

P-FOLIO: Evaluating and Improving Logical Reasoning with Abundant Human-Written Reasoning Chains

Enhancing Language Model Reasoning via Weighted Reasoning in Self-Consistency

Can LLMs Reason in the Wild with Programs?