Abstract: Enhancing the reasoning capability of large language models (LLMs) has gained significant attention in recent years. Previous studies have demonstrated the effectiveness of various prompting strategies (called "reasoning actions") in aiding LLM reasoning, such as step-by-step thinking, reflecting before answering, solving with programs, and their combinations. However, these approaches often apply static, predefined reasoning actions uniformly to all questions, without considering the specific characteristics of each question or the capability of the task-solving LLM. In this paper, we propose DOTS, an approach that enables LLMs to reason dynamically via optimal reasoning trajectory search, tailored to the specific characteristics of each question and the inherent capability of the task-solving LLM. Our approach involves three key steps: i) defining atomic reasoning action modules that can be composed into various reasoning action trajectories; ii) searching for the optimal action trajectory for each training question through iterative exploration and evaluation with the specific task-solving LLM; and iii) using the collected optimal trajectories to train an LLM to plan the reasoning trajectories of unseen questions. In particular, we propose two learning paradigms: fine-tuning an external LLM as a planner to guide the task-solving LLM, or directly fine-tuning the task-solving LLM so that it internalizes the capability of planning reasoning actions. Our experiments across eight reasoning tasks show that our method consistently outperforms static reasoning techniques and the vanilla instruction tuning approach. Further analysis reveals that our method enables LLMs to adjust their computation based on problem complexity, allocating deeper thinking and reasoning to harder problems.
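
Steps i) and ii) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the action names, the two-slot layout, the exhaustive enumeration (standing in for the paper's iterative exploration), the tie-break toward shorter trajectories, and the `toy_solve` stub that replaces a real task-solving LLM are all assumptions made for illustration.

```python
import itertools
from typing import Callable, List, Optional, Tuple

Trajectory = Tuple[str, ...]

# Hypothetical action slots; names are illustrative, not the paper's exact modules.
ACTION_SLOTS: List[List[str]] = [
    ["step_by_step", "program"],  # how the question is solved
    ["none", "reflect"],          # whether the answer is checked afterwards
]


def enumerate_trajectories(slots: List[List[str]]) -> List[Trajectory]:
    """Compose one action per slot into a trajectory, dropping 'none' placeholders."""
    return [
        tuple(a for a in combo if a != "none")
        for combo in itertools.product(*slots)
    ]


def search_best_trajectory(
    question: str,
    gold: str,
    solve: Callable[[str, Trajectory], str],
    n_samples: int = 4,
) -> Tuple[Optional[Trajectory], float]:
    """Score every candidate trajectory by its success rate on one question.

    Assumed tie-break: prefer the shorter trajectory at equal success rate.
    """
    best: Optional[Trajectory] = None
    best_rate = -1.0
    for traj in enumerate_trajectories(ACTION_SLOTS):
        hits = sum(solve(question, traj) == gold for _ in range(n_samples))
        rate = hits / n_samples
        if rate > best_rate or (rate == best_rate and len(traj) < len(best)):
            best, best_rate = traj, rate
    return best, best_rate


def toy_solve(question: str, traj: Trajectory) -> str:
    """Stub solver (assumption): only program-based trajectories answer correctly."""
    return "4" if "program" in traj else "5"


best, rate = search_best_trajectory("What is 2 + 2?", "4", toy_solve)
print(best, rate)
```

In a real pipeline, the (question, best trajectory) pairs collected this way would form the training data for the planner described in step iii).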

First Heuristic Then Rational: Dynamic Use of Heuristics in Language Model Reasoning

Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Step Reasoning

Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models

Why think step by step? Reasoning emerges from the locality of experience

Think Beyond Size: Adaptive Prompting for More Effective Reasoning

Break the Chain: Large Language Models Can be Shortcut Reasoners

Implicit Chain of Thought Reasoning via Knowledge Distillation

Rational Metareasoning for Large Language Models

Plan of Thoughts: Heuristic-Guided Problem Solving with Large Language Models

Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models

Heuristic Reasoning in AI: Instrumental Use and Mimetic Absorption

Reasoning with Large Language Models, a Survey

Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies

From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning

Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought

Boosting Language Models Reasoning with Chain-of-Knowledge Prompting

Reasoning Beyond Bias: A Study on Counterfactual Prompting and Chain of Thought Reasoning

Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models