Agentless: Demystifying LLM-based Software Engineering Agents

Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, Lingming Zhang
2024-10-30
Abstract: Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry practitioners have developed various autonomous LLM agents to perform end-to-end software development tasks. These agents are equipped with the ability to use tools, run commands, observe feedback from the environment, and plan for future actions. However, the complexity of these agent-based approaches, together with the limited abilities of current LLMs, raises the following question: Do we really have to employ complex autonomous software agents? To attempt to answer this question, we build Agentless -- an agentless approach to automatically solve software development problems. Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic three-phase process of localization, repair, and patch validation, without letting the LLM decide future actions or operate with complex tools. Our results on the popular SWE-bench Lite benchmark show that surprisingly the simplistic Agentless is able to achieve both the highest performance (32.00%, 96 correct fixes) and low cost ($0.70) compared with all existing open-source software agents! Furthermore, we manually classified the problems in SWE-bench Lite and found problems with exact ground truth patch or insufficient/misleading issue descriptions. As such, we construct SWE-bench Lite-S by excluding such problematic issues to perform more rigorous evaluation and comparison. Our work highlights the current overlooked potential of a simple, interpretable technique in autonomous software development. We hope Agentless will help reset the baseline, starting point, and horizon for autonomous software agents, and inspire future work along this crucial direction.
Software Engineering, Artificial Intelligence, Computation and Language, Machine Learning
What problem does this paper attempt to address?
The paper asks whether automating software development with large language models (LLMs) truly requires complex autonomous agents. Specifically, the authors question whether such agents are necessary for end-to-end software development tasks, which include code synthesis, program repair, and test generation.

The authors identify three issues with existing agent-based methods:

1. **Complexity of tool usage and design**: To use tools, existing methods often introduce an abstraction layer between the agent and the environment. This layer requires carefully designed input/output formats and can easily lead to inaccurate tool design or usage.
2. **Lack of control over decision planning**: Existing methods delegate decision-making to the agent, letting it decide when to act and which action to perform. The agent can get lost in a vast space of possible actions and explore it suboptimally.
3. **Limited self-reflection capability**: Existing agents find it difficult to filter out or correct irrelevant, incorrect, or misleading information, so an erroneous step can be amplified and affect subsequent decisions.

To address these issues, the authors propose Agentless, an agentless approach for automatically solving software development problems. In contrast to complex agent-based methods, Agentless follows a simple three-phase process of localization, repair, and patch validation, without requiring the LLM to decide future actions autonomously or to operate complex tools. On the popular SWE-bench Lite benchmark, Agentless achieved the highest performance among existing open-source software agents (32.00%, 96 correct fixes) at a low cost ($0.70).
Additionally, the authors manually classified the problems in SWE-bench Lite and found problems with an exact ground truth patch or with insufficient or misleading issue descriptions. They constructed SWE-bench Lite-S by excluding these problematic issues, which provides a more rigorous basis for evaluation and comparison. The work highlights the potential of simple, cost-effective techniques in autonomous software development. The authors hope that Agentless can reset the baseline, starting point, and horizon for autonomous software agents and inspire future research.
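
The three-phase process described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Patch` type, the `solve_issue` function, and the three callables passed into it are hypothetical names introduced here, and the toy stand-ins at the bottom replace what would be LLM calls and test runs in a real system. The point it tries to show is only the control flow: the phases run in a fixed order, and no component decides which action comes next.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Patch:
    """Hypothetical container for one candidate fix."""
    target_file: str
    diff: str


def solve_issue(
    issue: str,
    repo_files: Sequence[str],
    localize: Callable[[str, Sequence[str]], List[str]],
    propose_patches: Callable[[str, List[str]], List[Patch]],
    passes_validation: Callable[[Patch], bool],
) -> Optional[Patch]:
    """Run the three phases in a fixed order; nothing here plans future actions."""
    # Phase 1: localization - narrow the repository down to suspicious files.
    suspects = localize(issue, repo_files)
    # Phase 2: repair - generate candidate patches for those locations only.
    candidates = propose_patches(issue, suspects)
    # Phase 3: patch validation - return the first candidate that passes.
    # (Assumption: "first passing candidate" is a simplification for this sketch.)
    for patch in candidates:
        if passes_validation(patch):
            return patch
    return None


# Toy stand-ins (assumptions for illustration; a real system would query an LLM
# for localization and repair, and run tests for validation).
def keyword_localize(issue: str, repo_files: Sequence[str]) -> List[str]:
    words = {w for w in issue.lower().split() if len(w) > 3}
    return [f for f in repo_files if any(w in f.lower() for w in words)]


def toy_propose(issue: str, suspects: List[str]) -> List[Patch]:
    patches = []
    for path in suspects:
        patches.append(Patch(path, "no-op change"))
        patches.append(Patch(path, "guard empty input"))
    return patches


def toy_validate(patch: Patch) -> bool:
    return "guard" in patch.diff


result = solve_issue(
    "parser crashes on empty input",
    ["src/parser.py", "src/cli.py", "README.md"],
    keyword_localize,
    toy_propose,
    toy_validate,
)
print(result)
```

Because each phase is an ordinary function with explicit inputs and outputs, every intermediate result (the suspected files, the candidate patches, the validation verdicts) can be inspected directly, which is the kind of interpretability the summary attributes to the agentless design.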