Abstract: Autonomous agents powered by large language models (LLMs) show significant potential for achieving high autonomy in various scenarios such as software development. Recent research has shown that LLM agents can leverage past experiences to reduce errors and enhance efficiency. However, the static experience paradigm, reliant on a fixed collection of past experiences acquired heuristically, lacks iterative refinement and thus hampers agents' adaptability. In this paper, we introduce the Iterative Experience Refinement framework, enabling LLM agents to refine experiences iteratively during task execution. We propose two fundamental patterns: the successive pattern, refining based on nearest experiences within a task batch, and the cumulative pattern, acquiring experiences across all previous task batches. Augmented with our heuristic experience elimination, the method prioritizes high-quality and frequently-used experiences, effectively managing the experience space and enhancing efficiency. Extensive experiments show that while the successive pattern may yield superior results, the cumulative pattern provides more stable performance. Moreover, experience elimination facilitates achieving better performance using just 11.54% of a high-quality subset.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

The paper addresses how autonomous agents based on large language models (LLMs) can effectively use and iteratively improve past experiences while performing tasks. Existing methods rely on a static experience paradigm: a fixed set of historical experiences is collected once and then used to guide all future task execution. Because this paradigm has no iterative refinement mechanism, agents adapt poorly to complex tasks such as software development.

To solve this problem, the paper introduces the **Iterative Experience Refinement (IER) framework**, which enables agents to dynamically acquire, utilize, and eliminate experiences during task execution. The IER framework is implemented through two basic patterns:

1. **Successive Pattern**: Refine based on experiences from the most recent task batch.
2. **Cumulative Pattern**: Integrate experiences from all previous task batches.

In addition, to keep the experience space from growing without control, the paper proposes a heuristic experience elimination mechanism that prioritizes retaining high-quality and frequently used experiences, thereby improving the efficiency of experience management.

### Main contributions

1. **Propose the iterative experience refinement framework for the first time**: The framework enables agents to adaptively solve new tasks by dynamically acquiring, utilizing, and eliminating experiences.
2. **Propose an experience elimination mechanism**: It prioritizes retaining high-quality and frequently used experiences, reducing the inefficiency caused by the expansion of the experience space.
3. **Experimental verification**: Extensive experiments show that the successive pattern may perform better on some metrics, while the cumulative pattern provides more stable performance. In addition, the experience elimination mechanism achieves better performance while retaining only an 11.54% high-quality subset of the experiences.

### Methodology

- **Experience acquisition and utilization**: Through multi-round interactions of instructions and solutions, record and extract effective "shortcut" experiences.
- **Experience propagation**: Through the successive pattern and the cumulative pattern, transfer experiences from one task batch to the next.
- **Experience elimination**: Based on information density and usage frequency, eliminate low-quality experiences and retain high-quality ones.

### Experimental evaluation

- **Baseline methods**: GPTEngineer, MetaGPT, ChatDev, ECL, and others.
- **Dataset**: The SRDD dataset, which contains 1,200 software requirement descriptions and is divided into 6 task batches.
- **Evaluation metrics**:
  - Completeness
  - Executability
  - Consistency
  - Quality

The experimental results show that the IER framework significantly outperforms the baseline methods on multiple metrics, and that it improves the quality and efficiency of software generation without significantly increasing task execution time.
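The two propagation patterns and the elimination step described under Methodology can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Experience` fields, the function names, and the thresholds are all hypothetical, `quality` is only a stand-in for an information-density score, and requiring both a quality and a usage threshold is an assumed reading of "high-quality and frequently used".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """One stored experience. All fields are illustrative assumptions."""
    key: str        # hypothetical identifier of the "shortcut"
    batch: int      # index of the task batch that produced it
    quality: float  # stand-in for an information-density score
    uses: int = 0   # how often later tasks retrieved it


def successive_pool(history: list[list[Experience]]) -> list[Experience]:
    """Successive pattern (as assumed here): only the most recent batch."""
    return list(history[-1]) if history else []


def cumulative_pool(history: list[list[Experience]]) -> list[Experience]:
    """Cumulative pattern (as assumed here): all previous batches combined."""
    return [exp for batch in history for exp in batch]


def eliminate(pool: list[Experience], min_quality: float,
              min_uses: int) -> list[Experience]:
    """Keep experiences meeting both thresholds (an assumed heuristic)."""
    return [e for e in pool
            if e.quality >= min_quality and e.uses >= min_uses]


# Two toy task batches with made-up scores and usage counts.
history = [
    [Experience("a", 0, 0.9, 3), Experience("b", 0, 0.2, 5)],
    [Experience("c", 1, 0.8, 0), Experience("d", 1, 0.95, 2)],
]

print([e.key for e in successive_pool(history)])   # ['c', 'd']
print([e.key for e in cumulative_pool(history)])   # ['a', 'b', 'c', 'd']

kept = eliminate(cumulative_pool(history), min_quality=0.5, min_uses=1)
print([e.key for e in kept])                        # ['a', 'd']
```

In this toy run the cumulative pool is twice the size of the successive pool, and elimination halves it again, which mirrors the trade-off the summary describes between a broader experience space and the cost of managing it.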

Iterative Experience Refinement of Software-Developing Agents

AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback

Experiential Co-Learning of Software-Developing Agents

Watch Every Step! LLM Agent Learning via Iterative Step-Level Process Refinement

An Evaluation-Driven Approach to Designing LLM Agents: Process and Architecture

360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System

From Language Models to Practical Self-Improving Computer Agents

A Multi-AI Agent System for Autonomous Optimization of Agentic AI Solutions via Iterative Refinement and LLM-Driven Feedback Loops

ExpeL: LLM Agents Are Experiential Learners

Self-evolving Agents with reflective and memory-augmented abilities

Agent-Driven Automatic Software Improvement

Large Language Models Can Self-Improve At Web Agent Tasks

Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

Large Language Model As Autonomous Decision Maker

Training Language Model Agents without Modifying Language Models

Enhancing LLMs for Power System Simulations: A Feedback-driven Multi-agent Framework

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation

Recursive Introspection: Teaching Language Model Agents How to Self-Improve