Abstract: Recent advancements in large language models (LLMs) have significantly advanced the automation of software development tasks, including code synthesis, program repair, and test generation. More recently, researchers and industry practitioners have developed various autonomous LLM agents to perform end-to-end software development tasks. These agents are equipped with the ability to use tools, run commands, observe feedback from the environment, and plan for future actions. However, the complexity of these agent-based approaches, together with the limited abilities of current LLMs, raises the following question: Do we really have to employ complex autonomous software agents? To attempt to answer this question, we build Agentless -- an agentless approach to automatically solve software development problems. Compared to the verbose and complex setup of agent-based approaches, Agentless employs a simplistic three-phase process of localization, repair, and patch validation, without letting the LLM decide future actions or operate with complex tools. Our results on the popular SWE-bench Lite benchmark show that surprisingly the simplistic Agentless is able to achieve both the highest performance (32.00%, 96 correct fixes) and low cost ($0.70) compared with all existing open-source software agents! Furthermore, we manually classified the problems in SWE-bench Lite and found problems with exact ground truth patch or insufficient/misleading issue descriptions. As such, we construct SWE-bench Lite-S by excluding such problematic issues to perform more rigorous evaluation and comparison. Our work highlights the current overlooked potential of a simple, interpretable technique in autonomous software development. We hope Agentless will help reset the baseline, starting point, and horizon for autonomous software agents, and inspire future work along this crucial direction.

What problem does this paper attempt to address?

The paper asks whether LLM-based automation of software development really requires complex autonomous agents. Specifically, the authors question whether such agents are necessary for end-to-end software development tasks, which include code synthesis, program repair, and test generation. They identify three issues with existing agent-based methods:

1. **Complexity of tool usage and design**: To use tools, existing methods often introduce an abstraction layer between the agent and the environment. This layer requires carefully designed input/output formats and can easily lead to inaccurate tool design or usage.
2. **Lack of control over decision planning**: Existing methods delegate decision-making to the agent, letting it decide when to act and which action to perform. The agent can get lost in a vast space of possible actions and explore it suboptimally.
3. **Limited self-reflection capability**: Existing agents find it difficult to filter or correct irrelevant, incorrect, or misleading information, so an erroneous step can be amplified and affect subsequent decisions.

To address these issues, the authors propose Agentless, an agentless approach for automatically solving software development problems. Compared to complex agent methods, Agentless adopts a simple three-stage process of localization, repair, and patch validation, without requiring the LLM to autonomously decide future actions or operate complex tools.

Evaluated on the popular SWE-bench Lite benchmark, Agentless achieved the highest performance (32.00%, 96 correct fixes) among the existing open-source software agents it was compared with, and did so at a low cost ($0.70).
Additionally, the authors manually classified the problems in SWE-bench Lite and found some that include the exact ground-truth patch or have insufficient or misleading issue descriptions. They constructed SWE-bench Lite-S by excluding these problematic issues, which allows a more rigorous evaluation and comparison. The work highlights the potential of simple, interpretable, cost-effective techniques in autonomous software development. The authors hope that Agentless can reset the baseline, starting point, and horizon for autonomous software agents and inspire future research.
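
The contrast between a fixed pipeline and an agent loop may be easier to see in code. The following is a minimal sketch, not the authors' implementation: it hard-codes the three stages named above (localization, repair, patch validation) so that the model-backed steps only fill in each stage and never pick the next action. All names here (`run_fixed_pipeline`, `toy_localize`, `toy_propose`, `toy_validate`) are hypothetical, the "LLM" calls are replaced by toy stand-ins, and returning the first passing candidate is an assumption made for brevity.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical types for illustration; none of these names come from the paper.
Repo = Dict[str, str]   # file path -> file contents
Patch = Dict[str, str]  # file path -> replacement file contents


def run_fixed_pipeline(
    issue: str,
    repo: Repo,
    localize: Callable[[str, Repo], List[str]],
    propose_patches: Callable[[str, Repo, List[str]], List[Patch]],
    validate: Callable[[Repo], bool],
) -> Optional[Patch]:
    """Run localization, repair, and patch validation in a fixed order."""
    # Stage 1: localization - narrow the repository down to suspicious files.
    suspicious = localize(issue, repo)

    # Stage 2: repair - obtain candidate patches for those files.
    candidates = propose_patches(issue, repo, suspicious)

    # Stage 3: patch validation - keep the first candidate that passes.
    # (Assumption: a real system may rank or vote among candidates instead.)
    for patch in candidates:
        patched_repo = {**repo, **patch}
        if validate(patched_repo):
            return patch
    return None


def toy_localize(issue: str, repo: Repo) -> List[str]:
    # Stand-in for a model call: pick files whose path the issue mentions.
    return [path for path in repo if path in issue]


def toy_propose(issue: str, repo: Repo, files: List[str]) -> List[Patch]:
    # Stand-in for sampling several patches: one wrong, one right.
    return [
        {f: repo[f].replace("a - b", "a * b") for f in files},
        {f: repo[f].replace("a - b", "a + b") for f in files},
    ]


def toy_validate(repo: Repo) -> bool:
    # Stand-in for running tests against the patched code.
    namespace: dict = {}
    exec(repo["calc.py"], namespace)
    return namespace["add"](2, 3) == 5


repo = {"calc.py": "def add(a, b):\n    return a - b\n", "util.py": "X = 1\n"}
issue = "add() in calc.py returns the wrong result"
patch = run_fixed_pipeline(issue, repo, toy_localize, toy_propose, toy_validate)
print(patch)
```

In this sketch the order of the stages lives in ordinary code, so every run follows the same path and each intermediate result (suspicious files, candidate patches, validation outcome) can be inspected directly. An agent-based design would instead let the model choose which step or tool to invoke next.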

Agentless: Demystifying LLM-based Software Engineering Agents
