MASAI: Modular Architecture for Software-engineering AI Agents

Daman Arora, Atharv Sonwane, Nalin Wadhwa, Abhav Mehrotra, Saiteja Utpala, Ramakrishna Bairi, Aditya Kanade, Nagarajan Natarajan
2024-06-17
Abstract: A common method to solve complex problems in software engineering is to divide the problem into multiple sub-problems. Inspired by this, we propose a Modular Architecture for Software-engineering AI (MASAI) agents, where different LLM-powered sub-agents are instantiated with well-defined objectives and strategies tuned to achieve those objectives. Our modular architecture offers several advantages: (1) employing and tuning different problem-solving strategies across sub-agents, (2) enabling sub-agents to gather information from different sources scattered throughout a repository, and (3) avoiding unnecessarily long trajectories which inflate costs and add extraneous context. MASAI enabled us to achieve the highest performance (28.33% resolution rate) on the popular and highly challenging SWE-bench Lite dataset consisting of 300 GitHub issues from 11 Python repositories. We conduct a comprehensive evaluation of MASAI relative to other agentic methods and analyze the effects of our design decisions and their contribution to the success of MASAI.
Artificial Intelligence, Software Engineering
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper addresses the automated resolution of complex software engineering problems. Specifically, the authors propose a modular architecture (MASAI) for building software-engineering AI agents. MASAI decomposes a complex issue into multiple sub-problems and instantiates a sub-agent with a clear goal and strategy for each one. The main objectives of this approach are:

1. **Adopting different problem-solving strategies**: Different sub-agents can use different strategies (such as ReAct or CoT) to solve specific sub-problems.
2. **Collecting information from different sources**: Sub-agents can gather information from different files in the code repository.
3. **Avoiding unnecessarily long reasoning paths**: Shorter trajectories reduce inference cost and keep extraneous context from being passed along, which improves performance.

### Main Contributions

1. **Proposing a modular architecture**: MASAI allows sub-agent designs to be optimized individually and then combined to solve larger end-to-end software engineering tasks.
2. **Demonstrating the effectiveness of MASAI**: MASAI achieved the highest resolution rate (28.33%) on the SWE-bench Lite dataset.
3. **Detailed analysis of design decisions**: A thorough investigation of the key design decisions of MASAI and existing methods to guide future research and development.
4. **Contributing results**: Submitting results to the SWE-bench Lite leaderboard for validation.

### Experimental Setup

- **Dataset**: SWE-bench Lite, containing 300 software engineering tasks from 11 open-source repositories, mainly focusing on bug fixes.
- **Metrics**: Resolution rate (percentage of problems successfully solved), localization rate (percentage of proposed patches that fully cover the files changed by the real patch), and application rate (percentage of proposed patches successfully applied to the repository).
- **Comparison methods**: Compared with various existing methods, including SWE-agent, AutoCodeRover, OpenDevin, Aider, CodeR, Moatless, RAG, etc.

### Results

- **Performance**: MASAI achieved the highest resolution rate (28.33%) on the SWE-bench Lite dataset, tied with CodeR.
- **Assumptions**: MASAI avoids reliance on external signals (such as expert hints), relying only on the standard SWE-bench Lite setup.
- **Fault localization**: MASAI's Edit Localizer successfully localized the problem in 75% of cases, while other methods had lower localization rates.
- **Multi-step reasoning**: Through the ReAct strategy and its tool design, the Edit Localizer can perform multi-step reasoning flexibly and robustly to locate problems.

### Examples

- **Example 1**: In the scikit-learn__scikit-learn-13142 task, the Edit Localizer found the root cause of the problem through multi-step reasoning.
- **Example 2**: In the astropy__astropy-14995 task, the READ action returned an approximate match, helping the sub-agent find the actual faulty function.
- **Example 3**: In the matplotlib__matplotlib-25332 task, basic shell commands helped the Edit Localizer find the class that needed editing.

### Summary

The paper proposes the MASAI modular architecture, which decomposes complex software engineering problems and assigns them to different sub-agents. On the SWE-bench Lite dataset, MASAI matched or outperformed the existing methods it was compared against (tying with CodeR at a 28.33% resolution rate), demonstrating its potential for automating software engineering tasks.
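
The three metrics listed under Experimental Setup (resolution, localization, and application rate) can be illustrated with a minimal Python sketch. It is not the paper's or SWE-bench's evaluation code: the `TaskResult` record, its field names, and the `summarize` helper are hypothetical, and the sketch assumes one record per task, with the task count as the common denominator and "fully cover" read as the proposed patch editing a superset of the files in the reference patch.

```python
from dataclasses import dataclass, field


@dataclass
class TaskResult:
    """Outcome of one benchmark task (hypothetical record, not a SWE-bench type)."""
    task_id: str
    gold_files: set[str]                                  # files changed by the reference patch
    edited_files: set[str] = field(default_factory=set)   # files changed by the proposed patch
    applied: bool = False                                  # proposed patch applied cleanly
    resolved: bool = False                                 # issue counted as solved


def percentage(count: int, total: int) -> float:
    """Return count/total as a percentage, or 0.0 for an empty benchmark."""
    return 100.0 * count / total if total else 0.0


def summarize(results: list[TaskResult]) -> dict[str, float]:
    """Compute the three rates over all tasks (assumed common denominator)."""
    total = len(results)
    # A task is localized when the proposed patch touches every reference file.
    localized = sum(
        1 for r in results if r.edited_files and r.gold_files <= r.edited_files
    )
    applied = sum(1 for r in results if r.applied)
    resolved = sum(1 for r in results if r.resolved)
    return {
        "resolution_rate": percentage(resolved, total),
        "localization_rate": percentage(localized, total),
        "application_rate": percentage(applied, total),
    }


# Toy data with made-up task ids and file names.
demo = [
    TaskResult("task-a", {"x.py"}, {"x.py", "y.py"}, applied=True, resolved=True),
    TaskResult("task-b", {"x.py"}, {"z.py"}, applied=True),
    TaskResult("task-c", {"x.py", "w.py"}, {"x.py"}),
    TaskResult("task-d", {"x.py"}),
]
print(summarize(demo))
```

Under these assumptions, the reported 28.33% resolution rate on 300 tasks corresponds to 85 resolved issues, since `round(percentage(85, 300), 2)` evaluates to `28.33`.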