Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution

Ziyi Ni, Yifan Li, Daxiang Dong
2024-12-18
Abstract: The exceptional capabilities of large language models (LLMs) have substantially accelerated the rapid rise and widespread adoption of agents. Recent studies have demonstrated that generating Python code to consolidate LLM-based agents' actions into a unified action space (CodeAct) is a promising approach for developing real-world LLM agents. However, this step-by-step code generation approach often lacks consistency and robustness, leading to instability in agent applications, particularly for complex reasoning and out-of-domain tasks. In this paper, we propose a novel approach called Tree-of-Code (ToC) to tackle the challenges of complex problem planning and execution with an end-to-end mechanism. By integrating key ideas from both Tree-of-Thought and CodeAct, ToC combines their strengths to enhance solution exploration. In our framework, each final code execution result is treated as a node in the decision tree, with a breadth-first search strategy employed to explore potential solutions. The final outcome is determined through a voting mechanism based on the outputs of the nodes.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the lack of stability and robustness in existing code-generation methods for complex task planning and execution. Methods such as CodeAct can efficiently generate executable code for complex tasks, but they lose consistency across multi-step reasoning, so errors accumulate, and they become unstable on complex reasoning and out-of-domain tasks. To solve these problems, the paper proposes a new method named Tree-of-Code (ToC).

### Main Problems and Solutions

1. **Limitations of Existing Methods**:
   - **Lack of Consistency**: Existing step-by-step code-generation methods (such as CodeAct) are prone to inconsistency during multi-step reasoning, resulting in frequent interruptions and fragmented thinking.
   - **Lack of Robustness**: These methods are unstable on complex reasoning and out-of-domain tasks and are prone to randomness and hallucinations, which undermines task reliability.
2. **Proposed Solutions**:
   - **Tree-of-Code (ToC)**: Combines the advantages of Tree-of-Thought and CodeAct, and strengthens solution exploration through a tree-structured search.
   - **End-to-End Code Generation**: Generating complete code solutions end to end reduces the dependence on intermediate execution results and thereby improves stability.
   - **Node Expansion and State Evaluation**: Nodes of the decision tree are executed in parallel, and the method reflects on the execution results to improve subsequent nodes.
   - **Final Result by Majority Voting**: A large language model conducts majority voting over all successfully executed nodes to determine the final result.

### Specific Implementation Steps

1. **End-to-End Code Generation**: Generate a complete code solution, reducing the need for intermediate reflection and keeping the logic consistent and correct.
2. **Exploration of Incomplete Nodes**: Explore incomplete nodes by varying the prompt, the large language model, and the model temperature to improve the stability of results.
3. **Final Result by Majority Voting**: Conduct majority voting over all successfully executed nodes to determine the final output.

### Experimental Verification

The experimental results show that ToC provides more stable results on complex-task datasets, achieves higher accuracy than Tree-of-Thought, and needs fewer interaction steps than CodeAct, demonstrating overall effectiveness and robustness. Taken together, the paper aims to provide a more stable and robust framework for complex task planning and execution, addressing the deficiencies of existing methods on complex reasoning and out-of-domain tasks.
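
To make the three steps concrete, the following is a minimal sketch of the control flow the summary describes: a complete program is one node, failed nodes are expanded breadth-first with varied generation settings, and successful nodes vote on the answer. It is an illustration under stated assumptions, not the authors' implementation. The names `Node`, `run_code`, `tree_of_code`, and `stub_generate` are hypothetical, the convention that a program stores its answer in a variable called `result` is assumed, the LLM call is replaced by a stub, nodes run sequentially rather than in parallel, and a plain frequency count stands in for the LLM-based vote mentioned above.

```python
from collections import Counter, deque
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One tree node: a complete candidate program and its depth in the tree."""
    code: str
    depth: int
    output: Optional[str] = None
    error: Optional[str] = None


def run_code(code: str) -> Tuple[bool, Optional[str], Optional[str]]:
    """Execute a whole program; assumed convention: it sets a variable `result`."""
    scope: dict = {}
    try:
        exec(code, scope)  # illustration only; never exec untrusted code unsandboxed
    except Exception as exc:
        return False, None, f"{type(exc).__name__}: {exc}"
    if "result" not in scope:
        return False, None, "program did not set `result`"
    return True, repr(scope["result"]), None


def tree_of_code(
    task: str,
    generate_code: Callable[[str, str, Optional[str]], str],
    strategies: List[str],
    max_depth: int = 2,
) -> Optional[str]:
    """Breadth-first exploration of end-to-end programs, then a majority vote."""
    root = Node(generate_code(task, strategies[0], None), depth=0)
    queue = deque([root])
    finished: List[Node] = []
    while queue:
        node = queue.popleft()  # FIFO order gives breadth-first traversal
        ok, node.output, node.error = run_code(node.code)
        if ok:
            finished.append(node)
        elif node.depth < max_depth:
            # Expand a failed node: one child per strategy (a stand-in for
            # varying the prompt, model, or temperature), each shown the error.
            for strategy in strategies:
                child_code = generate_code(task, strategy, node.error)
                queue.append(Node(child_code, node.depth + 1))
    if not finished:
        return None
    # Stand-in for the LLM-based vote: pick the most frequent output.
    votes = Counter(node.output for node in finished)
    return votes.most_common(1)[0][0]


def stub_generate(task: str, strategy: str, error: Optional[str]) -> str:
    """Hypothetical replacement for an LLM call, used only for the demo."""
    if error is None:
        return "result = 1 / 0"  # the first attempt fails on purpose
    programs = {"a": "result = 6 * 7", "b": "result = 42", "c": "result = 41"}
    return programs[strategy]


if __name__ == "__main__":
    print(tree_of_code("multiply 6 by 7", stub_generate, ["a", "b", "c"]))
```

In the demo, the root program raises an error, its three children all run successfully, and two of the three agree, so the vote selects their shared output. Swapping `stub_generate` for a real model call and the `Counter` for an LLM judge would bring the sketch closer to the pipeline described above, at the cost of needing a sandbox for `exec`.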