Abstract: The exceptional capabilities of large language models (LLMs) have substantially accelerated the rapid rise and widespread adoption of agents. Recent studies have demonstrated that generating Python code to consolidate LLM-based agents' actions into a unified action space (CodeAct) is a promising approach for developing real-world LLM agents. However, this step-by-step code generation approach often lacks consistency and robustness, leading to instability in agent applications, particularly for complex reasoning and out-of-domain tasks. In this paper, we propose a novel approach called Tree-of-Code (ToC) to tackle the challenges of complex problem planning and execution with an end-to-end mechanism. By integrating key ideas from both Tree-of-Thought and CodeAct, ToC combines their strengths to enhance solution exploration. In our framework, each final code execution result is treated as a node in the decision tree, with a breadth-first search strategy employed to explore potential solutions. The final outcome is determined through a voting mechanism based on the outputs of the nodes.

What problem does this paper attempt to address?

The paper addresses the lack of stability and robustness in existing code-generation methods for complex task planning and execution. Methods such as CodeAct can efficiently generate executable code to complete complex tasks, but they lack consistency during multi-step reasoning, so errors accumulate from step to step, and they become unstable on complex reasoning and out-of-domain tasks. To address these problems, the paper proposes a new method named Tree-of-Code (ToC).

### Main Problems and Solutions

1. **Limitations of Existing Methods**:
   - **Lack of Consistency**: Existing step-by-step code-generation methods (such as CodeAct) are prone to inconsistency during multi-step reasoning, resulting in frequent interruptions and fragmented thinking.
   - **Lack of Robustness**: These methods are unstable on complex reasoning and out-of-domain tasks and are prone to randomness and hallucinations, which reduces the reliability of task results.
2. **Proposed Solutions**:
   - **Tree-of-Code (ToC)**: Combines the advantages of Tree-of-Thought and CodeAct and strengthens solution exploration through a tree-structured search.
   - **End-to-End Code Generation**: Generating complete code solutions end to end reduces the dependence on intermediate execution results, thereby improving stability.
   - **Node Expansion and State Evaluation**: Uses the decision-tree structure to execute nodes in parallel, then reflects on the execution results and improves on them.
   - **Determine the Final Result by Majority Voting**: Uses a large language model to conduct majority voting over all successfully executed nodes to determine the final result.

### Specific Implementation Steps

1. **End-to-End Code Generation**: Generate a complete code solution, reducing the need for intermediate reflection and keeping the solution logically consistent and correct.
2. **Exploration of Incomplete Nodes**: Expand incomplete nodes by varying the prompt, the large language model, and the model temperature to improve the stability of results.
3. **Determine the Final Result by Majority Voting**: Conduct majority voting over all successfully executed nodes to determine the final output.

### Experimental Verification

The experimental results show that ToC provides more stable results on complex-task datasets, achieves higher accuracy than Tree-of-Thought, and requires fewer interaction steps than CodeAct, demonstrating overall effectiveness and robustness.

In summary, the paper aims to provide a more stable and robust framework for complex task planning and execution that addresses the deficiencies of existing methods in complex reasoning and out-of-domain tasks.
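The steps described above (end-to-end generation, breadth-first expansion of incomplete nodes, and voting over successful nodes) can be illustrated with a minimal Python sketch. It is not the paper's implementation. Every name in it (`Node`, `run_code`, `tree_of_code`, `majority_vote`, `fake_generate`, `STRATEGIES`, `max_depth`) is hypothetical. Three simplifying assumptions are made: the LLM is replaced by a deterministic stub, an "incomplete" node is taken to mean one whose code raised an exception, and the vote is a plain frequency count rather than the LLM-based vote the summary describes.

```python
import contextlib
import io
from collections import Counter, deque
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Node:
    code: str                      # one complete, end-to-end program
    depth: int
    output: Optional[str] = None   # captured stdout if execution succeeded
    error: Optional[str] = None    # exception text if execution failed


def run_code(code: str) -> Tuple[Optional[str], Optional[str]]:
    """Execute a whole program and return (stdout, error)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # illustration only; a real agent needs a sandbox
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"
    return buffer.getvalue().strip(), None


def tree_of_code(task: str, generate: Callable, strategies: List[Dict],
                 max_depth: int = 2) -> List[Node]:
    """Breadth-first search over complete programs; returns successful nodes."""
    root = Node(code=generate(task, None, strategies[0]), depth=0)
    queue = deque([root])
    successes: List[Node] = []
    while queue:
        node = queue.popleft()
        node.output, node.error = run_code(node.code)
        if node.error is None:
            successes.append(node)   # successful nodes become voting candidates
        elif node.depth < max_depth:
            # Assumption: a failed node is expanded once per strategy,
            # where a strategy stands in for a prompt/model/temperature choice.
            for strategy in strategies:
                child_code = generate(task, node, strategy)
                queue.append(Node(code=child_code, depth=node.depth + 1))
    return successes


def majority_vote(nodes: List[Node]) -> Optional[str]:
    """Most frequent output among successful nodes (simplified, no LLM)."""
    if not nodes:
        return None
    return Counter(node.output for node in nodes).most_common(1)[0][0]


# Hypothetical stand-in for an LLM call, deterministic for the demo.
def fake_generate(task: str, parent: Optional[Node], strategy: Dict) -> str:
    if parent is None:
        return "print(undefined_name)"   # root attempt fails with NameError
    if strategy["temperature"] > 0.5:
        return "print(6 * 7 + 1)"        # a minority, wrong answer
    return "print(6 * 7)"


STRATEGIES = [{"temperature": 0.0}, {"temperature": 0.3}, {"temperature": 0.9}]

successes = tree_of_code("What is 6 times 7?", fake_generate, STRATEGIES)
answer = majority_vote(successes)
print(answer)  # prints 42
```

In this run the root program fails, its three children are generated with different strategies, all three execute successfully, and the vote picks the output produced by two of them. Replacing `fake_generate` with a real model call and `majority_vote` with an LLM-based judge would bring the sketch closer to what the summary describes.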

Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution

CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models

Tree of Problems: Improving structured problem solving with compositionality

Tree-Planner: Efficient Close-loop Task Planning with Large Language Models

Executable Code Actions Elicit Better LLM Agents

Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning

Consolidating Trees of Robotic Plans Generated Using Large Language Models to Improve Reliability

When Do Program-of-Thought Works for Reasoning?

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution

Robot Task Planning Based on Large Language Model Representing Knowledge with Directed Graph Structures

Unlocking Reasoning Potential in Large Langauge Models by Scaling Code-form Planning

AI Chain on Large Language Model for Unsupervised Control Flow Graph Generation for Statically-Typed Partial Code

CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration

Interactive and Expressive Code-Augmented Planning with Large Language Models

Instruct, Not Assist: LLM-based Multi-Turn Planning and Hierarchical Questioning for Socratic Code Debugging

Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions

Large Language Model Guided Tree-of-Thought

LLM as BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning

Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation