Abstract: Memory plays a pivotal role in enabling large language model (LLM)-based agents to engage in complex and long-term interactions, such as question answering (QA) and dialogue systems. While various memory modules have been proposed for these tasks, the impact of different memory structures across tasks remains insufficiently explored. This paper investigates how memory structures and memory retrieval methods affect the performance of LLM-based agents. Specifically, we evaluate four types of memory structures, including chunks, knowledge triples, atomic facts, and summaries, along with mixed memory that combines these components. In addition, we evaluate three widely used memory retrieval methods: single-step retrieval, reranking, and iterative retrieval. Extensive experiments conducted across four tasks and six datasets yield the following key insights: (1) Different memory structures offer distinct advantages, enabling them to be tailored to specific tasks; (2) Mixed memory structures demonstrate remarkable resilience in noisy environments; (3) Iterative retrieval consistently outperforms other methods across various scenarios. Our investigation aims to inspire further research into the design of memory systems for LLM-based agents.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is: **The impact of different memory structures and memory retrieval methods on the performance of large language model (LLM)-based agents is still unclear**. Specifically, although previous research has proposed multiple memory modules for complex and long-term interactive tasks (such as question-answering systems and dialogue systems), the performance of these memory structures in different tasks and their impacts have not been fully explored.

### Main problems of the paper

1. **Advantages and applicability of different memory structures**:
   - The paper evaluates four types of memory structures: chunks, knowledge triples, atomic facts, and summaries, as well as mixed memory.
   - Through experiments, it verifies the performance of these memory structures in different tasks to determine their respective advantages and applicable scenarios.
2. **Effectiveness of memory retrieval methods**:
   - The paper evaluates three widely used memory retrieval methods: single-step retrieval, reranking, and iterative retrieval.
   - It studies the performance of these retrieval methods on different tasks and datasets to determine which method is the most effective.
3. **Comprehensive evaluation and implications**:
   - Through extensive experiments on four tasks (multi-hop question answering, single-hop question answering, dialogue understanding, and reading comprehension) and six datasets, the paper reveals the following key insights:
     - Different memory structures have different advantages and can be customized according to specific tasks.
     - The mixed memory structure shows significant robustness in noisy environments.
     - The iterative retrieval method is consistently superior to other methods in various scenarios.
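The four memory structures and their mixed combination can be pictured as plain data types. The following is only an illustrative sketch, not the paper's implementation: the names `MixedMemory`, `as_texts`, and `chunk_text` are hypothetical, chunks are assumed to be fixed-size word windows, and triples, atomic facts, and summaries are assumed to be supplied by some extraction step that is not modeled here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MixedMemory:
    """Hypothetical container holding all four memory structures side by side."""

    chunks: list[str] = field(default_factory=list)
    # Knowledge triples, assumed here to be (head, relation, tail) tuples.
    triples: list[tuple[str, str, str]] = field(default_factory=list)
    atomic_facts: list[str] = field(default_factory=list)
    summaries: list[str] = field(default_factory=list)

    def as_texts(self) -> list[str]:
        """Flatten every structure into strings so one retriever can index them."""
        triple_texts = [" ".join(t) for t in self.triples]
        return self.chunks + triple_texts + self.atomic_facts + self.summaries


def chunk_text(text: str, size: int) -> list[str]:
    """Split text into consecutive chunks of at most `size` words (an assumption)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]
```

Flattening everything into strings is one simple way to let a single retriever search across structures; other designs could keep a separate index per structure.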
### Objective

The objective of the paper is to reveal, through systematic experiments and analysis, how different memory structures and retrieval methods affect the performance of LLM agents, and to provide a theoretical basis and practical guidance for designing more effective memory systems in the future.

### Formula representation

The three retrieval methods evaluated in the paper can be written as follows:

- **Single-step retrieval**:
  \[ M_r = \text{Retriever}(q, M_q, K) \]
  where \( M_r \) is the top \( K \) memories most relevant to the query \( q \).
- **Reranking**:
  \[ M_r = \text{LLM}(q, M_i, R, P_{\text{Rerank}}) \]
  where \( M_i = \text{Retriever}(q, M_q, K) \) is the initial candidate set and \( M_r \) is the top \( R \) memories selected from it according to the relevance score.
- **Iterative retrieval**: at iteration \( j \), the previous query \( q_{j-1} \) (presumably starting from the original query \( q \)) retrieves the top \( T \) memories, which are then used to refine the query:
  \[ M_j = \text{Retriever}(q_{j-1}, M_q, T) \]
  \[ q_j = \text{LLM}(M_j, P_{\text{Refine}}) \]
  After \( N \) iterations, the final query \( q_N \) is used to retrieve the top \( K \) most relevant memories:
  \[ M_r = \text{Retriever}(q_N, M_q, K) \]

These formulas define the three retrieval methods whose effect on LLM agent performance the paper compares.
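The three retrieval formulas can be made concrete with a small sketch. This is a toy illustration under stated assumptions, not the paper's code: `M_q` is taken to be a list of memory strings, the retriever is a hypothetical word-overlap scorer (`overlap_score`), and the LLM calls are replaced by caller-supplied functions (`llm_score`, `refine`). The `refine` stand-in also receives the previous query, which is an assumption beyond the formula as written.

```python
from __future__ import annotations

from typing import Callable


def overlap_score(query: str, memory: str) -> float:
    """Toy relevance: count of distinct query words that also occur in the memory."""
    return float(len(set(query.lower().split()) & set(memory.lower().split())))


def retrieve(query: str, memories: list[str], k: int) -> list[str]:
    """Single-step retrieval: M_r = Retriever(q, M_q, K)."""
    # sorted() is stable, so ties keep their original order in `memories`.
    ranked = sorted(memories, key=lambda m: overlap_score(query, m), reverse=True)
    return ranked[:k]


def rerank(
    query: str,
    memories: list[str],
    k: int,
    r: int,
    llm_score: Callable[[str, str], float],
) -> list[str]:
    """Reranking: retrieve K candidates (M_i), keep the top R by `llm_score`."""
    candidates = retrieve(query, memories, k)
    return sorted(candidates, key=lambda m: llm_score(query, m), reverse=True)[:r]


def iterative_retrieve(
    query: str,
    memories: list[str],
    k: int,
    t: int,
    n: int,
    refine: Callable[[str, list[str]], str],
) -> list[str]:
    """Iterative retrieval: N rounds of retrieve-then-refine, then a final top-K."""
    q = query
    for _ in range(n):
        m_j = retrieve(q, memories, t)  # M_j = Retriever(q_{j-1}, M_q, T)
        q = refine(q, m_j)              # stand-in for q_j = LLM(M_j, P_Refine)
    return retrieve(q, memories, k)     # M_r = Retriever(q_N, M_q, K)
```

With a `refine` function that simply appends the retrieved memories to the query, the refined query can match memories that share no words with the original question, which is the intuition behind using iteration for multi-hop questions.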

On the Structural Memory of LLM Agents

A Survey on the Memory Mechanism of Large Language Model based Agents

Empowering Working Memory for Large Language Model Agents

Memory Sharing for Large Language Model based Agents

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

Schrodinger's Memory: Large Language Models

HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model

"My agent understands me better": Integrating Dynamic Human-like Memory Recall and Consolidation in LLM-Based Agents

RecallM: An Adaptable Memory Mechanism with Temporal Understanding for Large Language Models

$\text{Memory}^3$: Language Modeling with Explicit Memory

Think-in-Memory: Recalling and Post-thinking Enable LLMs with Long-Term Memory

MemoryBank: Enhancing Large Language Models with Long-Term Memory

Leave It to Large Language Models! Correction and Planning with Memory Integration

Large Language Models Are Semi-Parametric Reinforcement Learning Agents

Beyond Memorization: The Challenge of Random Memory Access in Language Models

Enhancing Large Language Model with Self-Controlled Memory Framework

RET-LLM: Towards a General Read-Write Memory for Large Language Models

Disentangling Memory and Reasoning Ability in Large Language Models

QRMeM: Unleash the Length Limitation through Question then Reflection Memory Mechanism