MEMORYLLM: Towards Self-Updatable Large Language Models

Yu Wang, Xiusi Chen, Jingbo Shang, Julian McAuley
2024-02-07
Abstract: Existing Large Language Models (LLMs) usually remain static after deployment, which might make it hard to inject new knowledge into the model. We aim to build models containing a considerable portion of self-updatable parameters, enabling the model to integrate new knowledge effectively and efficiently. To this end, we introduce MEMORYLLM, a model that comprises a transformer and a fixed-size memory pool within the latent space of the transformer. MEMORYLLM can self-update with text knowledge and memorize the knowledge injected earlier. Our evaluations demonstrate the ability of MEMORYLLM to effectively incorporate new knowledge, as evidenced by its performance on model editing benchmarks. Meanwhile, the model exhibits long-term information retention capacity, which is validated through our custom-designed evaluations and long-context benchmarks. MEMORYLLM also shows operational integrity without any sign of performance degradation even after nearly a million memory updates.
Computer Science
What problem does this paper attempt to address?
This paper asks how to design a large language model (LLM) that can efficiently integrate new knowledge while minimizing the forgetting of previously learned knowledge. Specifically, it identifies five challenges:

1. **Efficiency**: The knowledge-injection process should be as simple as possible, ideally eliminating the need for back-propagation.
2. **Effectiveness**: New knowledge must actually be injected into the model and have a positive effect on its performance.
3. **Knowledge retention**: Because the memory pool has a fixed size, its capacity is bounded, so a mechanism is needed to phase out old knowledge gradually.
4. **Integrity**: The model must retain all of its capabilities no matter how many times the memory pool is updated.
5. **Non-redundancy**: Knowledge should be stored compactly, reducing redundancy and making better use of memory.

To address these challenges, the authors introduce MEMORYLLM, which embeds a fixed-size memory pool in the latent space of the LLM. The pool is designed to manage the integration of new knowledge with minimal forgetting, and its fixed size avoids unbounded growth.
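The fixed-size constraint and the gradual phase-out of old knowledge can be illustrated with a small sketch. The summary above does not say how slots are chosen for eviction, so the rule used here (drop a uniformly random subset of old slots and append the new ones) is an assumption made for illustration. The names `update_memory` and `expected_retention` are hypothetical, and the tuples stand in for the latent vectors a real model would store.

```python
import random


def update_memory(pool, new_slots, rng):
    """Return a pool of unchanged size with `new_slots` written into it.

    Assumed eviction rule (not taken from the summary): drop
    len(new_slots) old slots uniformly at random, keep the survivors
    in their original order, and append the new slots at the end.
    """
    n, k = len(pool), len(new_slots)
    if k > n:
        raise ValueError("cannot insert more slots than the pool holds")
    # Indices of the old slots that survive this update.
    keep = sorted(rng.sample(range(n), n - k))
    return [pool[i] for i in keep] + list(new_slots)


def expected_retention(pool_size, slots_per_update, num_updates):
    """Expected fraction of a given set of slots that is still present
    after `num_updates` later updates, under the random-drop rule."""
    return (1 - slots_per_update / pool_size) ** num_updates


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [("init", i) for i in range(8)]  # placeholder slots
    for step in range(3):
        fresh = [(f"step{step}", j) for j in range(2)]
        pool = update_memory(pool, fresh, rng)
    print(len(pool))  # the pool size never changes
    print(expected_retention(8, 2, 3))
```

Under this assumed rule, each update keeps a fraction 1 - K/N of every earlier injection in expectation (K new slots, pool size N), so older knowledge fades geometrically instead of disappearing at once, and the pool never grows.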