Abstract: Large language models (LLMs) have demonstrated impressive capability in reasoning and planning when integrated with tree-search-based prompting methods. However, since these methods ignore previous search experiences, they often make the same mistakes in the search process. To address this issue, we introduce Reflection on search Trees (RoT), an LLM reflection framework designed to improve the performance of tree-search-based prompting methods. It uses a strong LLM to summarize guidelines from previous tree search experiences to enhance the ability of a weak LLM. The guidelines are instructions about solving the task through tree search, which can prevent the weak LLM from repeating mistakes made in past search processes. In addition, we propose a novel state selection method, which identifies the critical information from historical search processes to help RoT generate more specific and meaningful guidelines. In our extensive experiments, we find that RoT significantly improves the performance of LLMs in reasoning or planning tasks with various tree-search-based prompting methods (e.g., BFS and MCTS). Non-tree-search-based prompting methods such as Chain-of-Thought (CoT) can also benefit from RoT guidelines, since RoT can provide task-specific knowledge collected from the search experience.

What problem does this paper attempt to address?

### The Problem Addressed by the Paper

The paper "RoT: Enhancing Large Language Models with Reflection on Search Trees" aims to address the issue of large language models (LLMs) repeatedly making the same mistakes in tree search methods.

#### Background Problem

Although existing tree search methods can significantly improve the performance of models in multi-step reasoning or planning tasks, they overlook past search experiences, leading to repeated mistakes during the search process. Specifically, these issues include:

- Incorrectly evaluating actions.
- Generating actions that lead to inefficient results.
- Failing to accurately predict the next state.

These problems result in low accuracy and poor search efficiency, causing the model to over-explore erroneous action paths.

#### Solution

To address the above issues, the authors introduce a new framework, **Reflection on Search Trees (RoT)**. The main goal of RoT is to improve the performance of tree search methods by reflecting on past search experiences. Specifically, RoT uses a powerful LLM to summarize guiding principles from past search processes and applies these principles to enhance a weaker LLM, thereby avoiding repeated mistakes and improving decision-making capabilities.

#### Key Techniques

1. **Important State Selection**: Selecting key states from the generated search tree that have a significant impact on the final result.
2. **Guiding Principle Generation**: Generating specific guiding principles based on the selected important states to help the model make better decisions in future search processes.
3. **Iterative Improvement**: Gradually optimizing the search tree and guiding principles through multiple applications of RoT, further enhancing the model's performance.

#### Experimental Validation

The authors evaluated the effectiveness of RoT on several complex reasoning and planning tasks, including:

- **Blocksworld**: Manipulating blocks to reach a target state from an initial state.
- **GSM8k**: Mathematical reasoning tasks.
- **CraigslistBargain**: Bargaining tasks between buyers and sellers.

Experimental results show that RoT significantly improves the performance of various LLMs on these tasks, and that the effect becomes more pronounced as task difficulty increases.

### Summary

RoT addresses the issue of LLMs repeatedly making mistakes in tree search methods by reflecting on past search experiences and generating guiding principles, which improves the model's search efficiency and accuracy. The method is not limited to tree search: it can also enhance the performance of non-tree-search methods such as Chain-of-Thought.
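
The important-state selection and guideline-generation steps listed under Key Techniques can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the `Node` structure, the importance score (taken here to be the largest absolute change in value between a state and any of its children), the threshold, and the prompt wording are all hypothetical. The call to the stronger LLM is omitted; the sketch only builds the text that would be sent to it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One state in a search tree (hypothetical structure for illustration)."""
    state: str
    value: float  # value estimate produced by the search, e.g. by BFS or MCTS
    children: list[Node] = field(default_factory=list)


def importance(node: Node) -> float:
    """Assumed score: largest absolute value change from a state to any child."""
    if not node.children:
        return 0.0
    return max(abs(child.value - node.value) for child in node.children)


def select_important_states(root: Node, threshold: float) -> list[Node]:
    """Walk the whole tree and keep states whose importance exceeds the threshold."""
    selected: list[Node] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if importance(node) > threshold:
            selected.append(node)
        stack.extend(node.children)
    return selected


def build_reflection_prompt(nodes: list[Node]) -> str:
    """Format the selected states as text for a stronger LLM to summarize."""
    lines = ["Below are decisive states from earlier searches on this task."]
    for node in nodes:
        lines.append(f"State: {node.state} (value {node.value:.2f})")
        for child in node.children:
            lines.append(f"  -> {child.state} (value {child.value:.2f})")
    lines.append("Summarize guidelines that would help avoid the mistakes seen above.")
    return "\n".join(lines)
```

In this sketch, the guidelines returned by the stronger model would be added to the weaker model's prompt on the next search, and repeating the cycle corresponds to the iterative improvement step.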

RoT: Enhancing Large Language Models with Reflection on Search Trees

Autonomous Tree-search Ability of Large Language Models

Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning

Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

Effective Large Language Model Debugging with Best-first Tree Search

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up

Large Language Model Guided Tree-of-Thought

Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models

Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting

BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving

Logic-of-Thought: Injecting Logic into Contexts for Full Reasoning in Large Language Models

Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Let's Be Self-generated via Step by Step: A Curriculum Learning Approach to Automated Reasoning with Large Language Models

Making Large Language Models Better Reasoners with Orchestrated Streaming Experiences

Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning

Reasoning with Language Model is Planning with World Model

Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation

HBTP: Heuristic Behavior Tree Planning with Large Language Model Reasoning