AI-native Memory: A Pathway from LLMs Towards AGI

Jingbo Shang, Zai Zheng, Jiale Wei, Xiang Ying, Felix Tao, Mindverse Team
2024-08-28
Abstract: Large language models (LLMs) have shown the world sparks of artificial general intelligence (AGI). One opinion, especially from some startups working on LLMs, argues that an LLM with nearly unlimited context length can realize AGI. However, they might be too optimistic about the long-context capability of (existing) LLMs: (1) recent literature has shown that their effective context length is significantly smaller than their claimed context length; and (2) our reasoning-in-a-haystack experiments further demonstrate that simultaneously finding the relevant information in a long context and conducting (simple) reasoning is nearly impossible. In this paper, we envision a pathway from LLMs to AGI through the integration of memory. We believe that AGI should be a system where LLMs serve as core processors. In addition to raw data, the memory in this system would store a large number of important conclusions derived from reasoning processes. Compared with retrieval-augmented generation (RAG), which merely processes raw data, this approach not only connects semantically related information more closely, but also simplifies complex inferences at query time. As an intermediate stage, the memory will likely take the form of natural language descriptions, which can also be consumed directly by users. Ultimately, every agent/person should have its own large personal model, a deep neural network model (thus AI-native) that parameterizes and compresses all types of memory, even those that cannot be described in natural language. Finally, we discuss the significant potential of AI-native memory as the transformative infrastructure for (proactive) engagement, personalization, distribution, and social in the AGI era, as well as the privacy and security challenges it incurs, with preliminary solutions.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper examines the limitations of large language models (LLMs) in achieving artificial general intelligence (AGI) and proposes a path that adds a memory mechanism to overcome them. Specifically:

1. **Insufficient long-context processing capability of LLMs**: Although current LLMs claim to handle very long contexts, their effective context length is much shorter than the claimed length. In addition, LLMs find it nearly impossible to retrieve the relevant information from a long text and reason over it at the same time.
2. **LLMs alone are insufficient to achieve AGI**: The paper argues that relying solely on LLMs with nearly unlimited context cannot achieve true AGI, because such models struggle to use ultra-long inputs effectively for complex reasoning tasks.
3. **Introducing a memory mechanism**: To compensate for these shortcomings in long-context processing, the authors propose combining a memory mechanism with LLMs. This memory stores not only raw data but also important conclusions derived from reasoning, enabling the system to handle complex tasks better.

In summary, the main contribution of the paper is to highlight the limitations of current LLMs and to propose integrating a memory mechanism to construct a system architecture that is closer to AGI.
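The idea of a memory that holds derived conclusions alongside raw data can be illustrated with a toy sketch. This is not the authors' implementation: the names `MemoryItem` and `ToyMemory` are hypothetical, the conclusion is written by hand as a stand-in for an LLM reasoning step, and the word-overlap scoring is only a placeholder for a real retriever.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    text: str
    kind: str  # "raw" for original data, "conclusion" for a derived statement
    sources: list = field(default_factory=list)  # indices of supporting raw items


class ToyMemory:
    """Hypothetical store that keeps raw records and conclusions side by side."""

    def __init__(self):
        self.items = []

    def add_raw(self, text):
        self.items.append(MemoryItem(text, "raw"))
        return len(self.items) - 1

    def add_conclusion(self, text, sources):
        # In the paper's vision an LLM would derive this text; here it is supplied by hand.
        self.items.append(MemoryItem(text, "conclusion", list(sources)))
        return len(self.items) - 1

    def search(self, query, top_k=1):
        # Placeholder retrieval: rank items by the number of words shared with the query.
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(item.text.lower().split())), index)
            for index, item in enumerate(self.items)
        ]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [self.items[index] for score, index in scored[:top_k] if score > 0]


memory = ToyMemory()
flight = memory.add_raw("Alice booked a flight to Tokyo on Monday")
hotel = memory.add_raw("Alice reserved a hotel in Kyoto from Tuesday")
memory.add_conclusion("Alice will travel in Japan next week", [flight, hotel])

best = memory.search("where will alice travel next week")[0]
print(best.kind, "|", best.text)
```

In this toy case the query matches the stored conclusion more strongly than either raw record, so answering requires no reasoning across the two records at query time, which is the simplification the abstract attributes to storing conclusions.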