CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models

Jierui Li, Hung Le, Yinbo Zhou, Caiming Xiong, Silvio Savarese, Doyen Sahoo

2024-11-07

Abstract: Pre-trained on massive amounts of code and text data, large language models (LLMs) have demonstrated remarkable achievements in performing code generation tasks. With additional execution-based feedback, these models can act as agents with capabilities to self-refine and improve generated code autonomously. However, on challenging coding tasks with extremely large search space, current agentic approaches still struggle with multi-stage planning, generating, and debugging. To address this problem, we propose CodeTree, a framework for LLM agents to efficiently explore the search space in different stages of the code generation process. Specifically, we adopted a unified tree structure to explicitly explore different coding strategies, generate corresponding coding solutions, and subsequently refine the solutions. In each stage, critical decision-making (ranking, termination, expanding) of the exploration process is guided by both the environmental execution-based feedback and LLM-agent-generated feedback. We comprehensively evaluated CodeTree on 7 code generation benchmarks and demonstrated the significant performance gains of CodeTree against strong baselines. Using GPT-4o as the base model, we consistently achieved top results of 95.1 on HumanEval, 98.7 on MBPP, and 43.0 on CodeContests. On the challenging SWEBench benchmark, our approach led to significant performance gains.

Computation and Language

What problem does this paper attempt to address?

This paper addresses how LLM agents can efficiently explore the search space across the stages of code generation, particularly on complex programming tasks whose search spaces are extremely large and where current agentic approaches struggle with multi-stage planning, generation, and debugging. It proposes CodeTree, a tree-structured framework that explicitly explores different coding strategies, generates corresponding solutions, and then refines them, with the aim of improving both the quality and the efficiency of code generation. At each stage, the key decisions (ranking, termination, expansion) are guided by two signals: execution-based feedback from the environment and feedback generated by LLM agents.
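The search loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Node`, `run_tests`, `tree_search`, `refine`, and `critic` are hypothetical, the LLM agents are replaced by plain Python callables, and the ranking rule (passed tests first, critic score second) is an assumption made for illustration. Candidates are executed with `exec`, which is unsafe for untrusted code; a real system would need a sandbox.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Test = Tuple[tuple, object]  # (positional arguments, expected result)


@dataclass
class Node:
    code: str                   # candidate solution source
    depth: int                  # 0 = initial strategy, >0 = refinement
    passed: int = 0             # execution feedback: visible tests passed
    critic_score: float = 0.0   # stand-in for LLM-agent feedback


def run_tests(code: str, tests: List[Test], entry: str = "solve") -> int:
    """Execution feedback: count how many visible tests the candidate passes."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # NOTE: unsafe outside a sandbox
    except Exception:
        return 0
    func = namespace.get(entry)
    if not callable(func):
        return 0
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test counts as a failure
    return passed


def tree_search(
    roots: List[str],
    refine: Callable[[str], List[str]],   # hypothetical "debugger" agent
    critic: Callable[[str], float],       # hypothetical "critic" agent
    tests: List[Test],
    max_nodes: int = 20,
    max_depth: int = 3,
) -> Optional[Node]:
    def make(code: str, depth: int) -> Node:
        return Node(code, depth, run_tests(code, tests), critic(code))

    frontier = [make(code, 0) for code in roots]
    best: Optional[Node] = None
    explored = 0
    while frontier and explored < max_nodes:
        # Ranking: most passed tests first, critic score breaks ties.
        frontier.sort(key=lambda n: (n.passed, n.critic_score), reverse=True)
        node = frontier.pop(0)
        explored += 1
        if best is None or (node.passed, node.critic_score) > (best.passed, best.critic_score):
            best = node
        # Termination: stop once every visible test passes.
        if node.passed == len(tests):
            return node
        # Expansion: ask the refiner for children of the chosen node.
        if node.depth < max_depth:
            frontier.extend(make(c, node.depth + 1) for c in refine(node.code))
    return best


# Toy usage: two flawed "strategies" and a rule-based refiner.
tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
roots = [
    "def solve(a, b):\n    return a * b\n",
    "def solve(a, b):\n    return a - b\n",
]
result = tree_search(
    roots,
    refine=lambda code: [code.replace("*", "+"), code.replace("-", "+")],
    critic=lambda code: 0.0,
    tests=tests,
)
print(result.code)
```

In this toy run the refiner stands in for a debugging agent and the critic returns a constant, so the ranking is driven entirely by execution feedback; with real agents, the critic score would carry the LLM-generated judgment.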

