MEMORYLLM: Towards Self-Updatable Large Language Models

Yu Wang, Xiusi Chen, Jingbo Shang, Julian McAuley

2024-02-07

Abstract: Existing Large Language Models (LLMs) usually remain static after deployment, which might make it hard to inject new knowledge into the model. We aim to build models containing a considerable portion of self-updatable parameters, enabling the model to integrate new knowledge effectively and efficiently. To this end, we introduce MEMORYLLM, a model that comprises a transformer and a fixed-size memory pool within the latent space of the transformer. MEMORYLLM can self-update with text knowledge and memorize the knowledge injected earlier. Our evaluations demonstrate the ability of MEMORYLLM to effectively incorporate new knowledge, as evidenced by its performance on model editing benchmarks. Meanwhile, the model exhibits long-term information retention capacity, which is validated through our custom-designed evaluations and long-context benchmarks. MEMORYLLM also shows operational integrity without any sign of performance degradation even after nearly a million memory updates.

Computer Science

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to design a large language model (LLM) that can efficiently integrate new knowledge while minimizing the forgetting of previously learned knowledge. Specifically, the paper identifies five challenges:

1. **Efficiency**: The knowledge-injection process should be as simple as possible. Ideally, it requires no back-propagation.
2. **Effectiveness**: New knowledge must be effectively injected into the model and have a positive impact on its performance.
3. **Knowledge retention**: The memory pool has a fixed size, so its capacity is bounded. A mechanism is therefore required to phase out old knowledge gradually.
4. **Integrity**: No matter how many times the memory pool is updated, the model must retain all of its functionality.
5. **Non-redundancy**: Knowledge should be stored compactly, reducing redundancy and optimizing memory usage.

To address these challenges, the authors introduce MEMORYLLM, which embeds a fixed-size memory pool in the latent space of the LLM. The pool is designed to integrate new knowledge while forgetting as little as possible, and its fixed size prevents unbounded growth.
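The interplay between a fixed-size pool and gradual forgetting can be illustrated with a toy sketch. It is not the authors' implementation: the class `FixedSizeMemoryPool`, the random-eviction rule, and the tagged tuples standing in for latent memory vectors are all assumptions made for illustration. In the real model the new entries would come from a forward pass of the transformer, which is omitted here.

```python
import random


class FixedSizeMemoryPool:
    """Toy pool of constant capacity (hypothetical, for illustration only).

    Each update evicts k randomly chosen slots and appends k new entries,
    so the pool never grows and older entries fade out gradually.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        # Placeholder entries; a real pool would hold latent vectors.
        self.slots = [("init", i) for i in range(capacity)]
        self._rng = random.Random(seed)

    def update(self, new_items):
        new_items = list(new_items)
        k = len(new_items)
        if k > self.capacity:
            raise ValueError("update is larger than the pool capacity")
        # Assumed eviction rule: drop k distinct slots uniformly at random.
        evict = set(self._rng.sample(range(self.capacity), k))
        survivors = [s for i, s in enumerate(self.slots) if i not in evict]
        # No gradient step is involved: the update is a pure replacement.
        self.slots = survivors + new_items

    def count(self, tag):
        """Number of slots still holding entries with the given tag."""
        return sum(1 for t, _ in self.slots if t == tag)


pool = FixedSizeMemoryPool(capacity=1000)
pool.update([("fact_A", i) for i in range(100)])  # inject one piece of knowledge
for step in range(20):                            # then 20 unrelated updates
    pool.update([(f"filler_{step}", i) for i in range(100)])

remaining = pool.count("fact_A")
# Under uniform random eviction, each slot survives one update with
# probability 1 - k/capacity, so the expected survivors decay geometrically.
expected = 100 * (1 - 100 / 1000) ** 20
print(len(pool.slots), remaining, round(expected, 2))
```

Under this assumed eviction rule the pool size stays constant across updates, and the number of surviving `fact_A` entries shrinks roughly geometrically rather than vanishing at once, which is one way to realize the "gradually phase out old knowledge" requirement.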

Augmenting Language Models with Long-Term Memory

Self-Updatable Large Language Models with Parameter Integration

Enhancing Large Language Model with Self-Controlled Memory Framework

MemLLM: Finetuning LLMs to Use An Explicit Read-Write Memory

Large Language Models Are Semi-Parametric Reinforcement Learning Agents

RecallM: An Adaptable Memory Mechanism with Temporal Understanding for Large Language Models

RET-LLM: Towards a General Read-Write Memory for Large Language Models

Large Language Models with Controllable Working Memory

Schrodinger's Memory: Large Language Models

CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models

Needle in the Haystack for Memory Based Large Language Models

MemoryBank: Enhancing Large Language Models with Long-Term Memory

$\text{Memory}^3$: Language Modeling with Explicit Memory

CAMELoT: Towards Large Language Models with Training-Free Consolidated Associative Memory

Empowering Working Memory for Large Language Model Agents

MindLLM: Pre-training Lightweight Large Language Model from Scratch, Evaluations and Domain Applications

Self-evolving Agents with reflective and memory-augmented abilities

UniMem: Towards a Unified View of Long-Context Large Language Models