Abstract: A common method to solve complex problems in software engineering is to divide the problem into multiple sub-problems. Inspired by this, we propose a Modular Architecture for Software-engineering AI (MASAI) agents, where different LLM-powered sub-agents are instantiated with well-defined objectives and strategies tuned to achieve those objectives. Our modular architecture offers several advantages: (1) employing and tuning different problem-solving strategies across sub-agents, (2) enabling sub-agents to gather information from different sources scattered throughout a repository, and (3) avoiding unnecessarily long trajectories which inflate costs and add extraneous context. MASAI enabled us to achieve the highest performance (28.33% resolution rate) on the popular and highly challenging SWE-bench Lite dataset consisting of 300 GitHub issues from 11 Python repositories. We conduct a comprehensive evaluation of MASAI relative to other agentic methods and analyze the effects of our design decisions and their contribution to the success of MASAI.
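
The modular idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SubAgent` class, the `inputs` field, `run_pipeline`, `stub_llm`, and the `fixer` sub-agent are hypothetical names introduced here for illustration, and the LLM is replaced by a stub so the example runs offline. The sketch shows each sub-agent having its own objective and strategy, and seeing only the earlier outputs it asks for, which keeps its context short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: an LLM is modeled as a plain function from prompt text to response text.
LLM = Callable[[str], str]


@dataclass
class SubAgent:
    """One sub-agent with its own objective and problem-solving strategy."""
    name: str
    objective: str
    strategy: str  # e.g. "ReAct" or "CoT"
    llm: LLM
    # Keys of earlier outputs this sub-agent is allowed to see (hypothetical field).
    inputs: List[str] = field(default_factory=lambda: ["issue"])

    def run(self, context: Dict[str, str]) -> str:
        # Only the selected inputs enter the prompt, so context stays short.
        visible = {key: context[key] for key in self.inputs if key in context}
        lines = [f"Strategy: {self.strategy}", f"Objective: {self.objective}"]
        lines += [f"{key}: {value}" for key, value in visible.items()]
        return self.llm("\n".join(lines))


def run_pipeline(agents: List[SubAgent], issue: str) -> Dict[str, str]:
    """Run sub-agents in order, storing each output under the agent's name."""
    context = {"issue": issue}
    for agent in agents:
        context[agent.name] = agent.run(context)
    return context


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call; reports how many prompt lines it received.
    return f"response to {len(prompt.splitlines())} prompt lines"


agents = [
    SubAgent("edit_localizer", "find the files to edit", "ReAct", stub_llm),
    SubAgent("fixer", "propose a patch", "CoT", stub_llm,
             inputs=["issue", "edit_localizer"]),
]
result = run_pipeline(agents, "example issue text")
print(result["fixer"])
```

Because each sub-agent declares its inputs explicitly, a strategy or prompt can be tuned for one sub-agent without changing what the others receive.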

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

The paper addresses the automated resolution of complex software engineering problems. Specifically, the authors propose a modular architecture (MASAI) for building software engineering AI agents. MASAI decomposes a complex issue into multiple sub-problems and instantiates a sub-agent with a clear goal and strategy for each sub-problem. The main objectives of this approach are:

1. **Adopting different problem-solving strategies**: Different sub-agents can use different strategies (such as ReAct or CoT) to solve specific sub-problems.
2. **Collecting information from different sources**: Sub-agents can gather information from different files in the code repository.
3. **Avoiding unnecessarily long reasoning paths**: Shorter trajectories reduce inference cost and avoid passing redundant context, thereby improving performance.

### Main Contributions

1. **Proposing a modular architecture**: MASAI allows sub-agent designs to be optimized individually and then combined to solve larger end-to-end software engineering tasks.
2. **Demonstrating the effectiveness of MASAI**: MASAI achieved the highest resolution rate (28.33%) on the SWE-bench Lite dataset.
3. **Detailed analysis of design decisions**: A thorough investigation of the key design decisions of MASAI and existing methods, intended to guide future research and development.
4. **Contributing results**: The results were submitted to the SWE-bench Lite leaderboard for validation.

### Experimental Setup

- **Dataset**: SWE-bench Lite, containing 300 software engineering tasks from 11 open-source Python repositories, mainly focusing on bug fixes.
- **Metrics**: Resolution rate (percentage of issues successfully resolved), localization rate (percentage of proposed patches that fully cover the files changed in the real patch), and application rate (percentage of proposed patches that apply successfully to the repository).
- **Comparison methods**: MASAI is compared with existing methods, including SWE-agent, AutoCodeRover, OpenDevin, Aider, CodeR, Moatless, and RAG.

### Results

- **Performance**: MASAI achieved the highest resolution rate (28.33%) on the SWE-bench Lite dataset, tied with CodeR.
- **Assumptions**: MASAI avoids reliance on external signals (such as expert hints) and uses only the standard SWE-bench Lite setup.
- **Fault localization**: MASAI's Edit Localizer successfully localized the problem in 75% of cases, while other methods had lower localization rates.
- **Multi-step reasoning**: Through the ReAct strategy and its tool design, the Edit Localizer can flexibly and robustly perform multi-step reasoning to locate problems.

### Examples

- **Example 1**: In the scikit-learn__scikit-learn-13142 task, the Edit Localizer found the root cause of the problem through multi-step reasoning.
- **Example 2**: In the astropy__astropy-14995 task, the READ action returned an approximate match, helping the sub-agent find the actual faulty function.
- **Example 3**: In the matplotlib__matplotlib-25332 task, basic shell commands helped the Edit Localizer find the class that needed editing.

### Summary

The paper proposes the MASAI modular architecture, which decomposes complex software engineering problems and assigns them to different sub-agents. MASAI achieved the highest resolution rate among the compared methods on the SWE-bench Lite dataset (tied with CodeR), demonstrating its potential for automating software engineering tasks.
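
The three metrics described in the experimental setup can be computed as in the following Python sketch. The `Outcome` record, its field names, and the toy data are assumptions made for illustration, and so is the choice of all tasks as the denominator for every metric; the paper's exact bookkeeping may differ. As a sanity check on the headline number, a 28.33% resolution rate on 300 issues corresponds to 85 resolved issues.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Outcome:
    """Per-task result (hypothetical record; field names are illustrative)."""
    resolved: bool                # the issue's tests pass after the patch
    applied: bool                 # the patch applied cleanly to the repository
    edited_files: FrozenSet[str]  # files touched by the proposed patch
    gold_files: FrozenSet[str]    # files touched by the real patch


def percentage(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to two decimals."""
    return round(100.0 * hits / total, 2) if total else 0.0


def summarize(outcomes: List[Outcome]) -> Dict[str, float]:
    # Assumption: every metric uses the total number of tasks as its denominator.
    total = len(outcomes)
    return {
        "resolution": percentage(sum(o.resolved for o in outcomes), total),
        # Localized when the proposed patch covers every file of the real patch.
        "localization": percentage(
            sum(o.gold_files <= o.edited_files for o in outcomes), total
        ),
        "application": percentage(sum(o.applied for o in outcomes), total),
    }


# Toy data, not taken from the paper.
toy = [
    Outcome(True, True, frozenset({"a.py"}), frozenset({"a.py"})),
    Outcome(False, True, frozenset({"a.py", "b.py"}), frozenset({"b.py"})),
    Outcome(False, True, frozenset({"c.py"}), frozenset({"d.py"})),
    Outcome(False, False, frozenset(), frozenset({"e.py"})),
]
summary = summarize(toy)
print(summary)
print(percentage(85, 300))  # 85 of 300 issues gives the reported 28.33
```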

MASAI: Modular Architecture for Software-engineering AI Agents
