GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation

Brendan Hogan Rappazzo, Yingheng Wang, Aaron Ferber, Carla Gomes
2024-09-24
Abstract: The ability to form, retrieve, and reason about memories in response to stimuli serves as the cornerstone for general intelligence - shaping entities capable of learning, adaptation, and intuitive insight. Large Language Models (LLMs) have proven their ability, given the proper memories or context, to reason and respond meaningfully to stimuli. However, they are still unable to optimally encode, store, and retrieve memories - the ability to do this would unlock their full ability to operate as AI agents, and to specialize to niche domains. To remedy this, one promising area of research is Retrieval Augmented Generation (RAG), which aims to augment LLMs by providing them with rich in-context examples and information. In question-answering (QA) applications, RAG methods embed the text of interest in chunks, and retrieve the most relevant chunks for a prompt using text embeddings. Motivated by human memory encoding and retrieval, we aim to improve over standard RAG methods by generating and encoding higher-level information and tagging the chunks by their utility to answer questions. We introduce Graphical Eigen Memories For Retrieval Augmented Generation (GEM-RAG). GEM-RAG works by tagging each chunk of text in a given text corpus with LLM-generated "utility" questions, connecting chunks in a graph based on the similarity of both their text and utility questions, and then using the eigendecomposition of the memory graph to build higher-level summary nodes that capture the main themes of the text. We evaluate GEM-RAG, using both UnifiedQA and GPT-3.5 Turbo as the LLMs, with SBERT and OpenAI's text encoders on two standard QA tasks, showing that GEM-RAG outperforms other state-of-the-art RAG methods on these tasks. We also discuss the implications of having a robust RAG system and future directions.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses how to improve the memory capabilities of large language models (LLMs) so that they can encode, store, and retrieve information more effectively. It proposes GEM-RAG, which improves on standard Retrieval Augmented Generation (RAG) by generating and encoding higher-level information and by tagging each text chunk with LLM-generated "utility" questions, reflecting the chunk's utility for answering questions. GEM-RAG builds a weighted graph that connects chunks according to the similarity of both their text and their utility questions, then uses the spectral decomposition (eigendecomposition) of this graph to generate higher-level summary nodes that capture the main themes of the text. Besides offering a more systematic approach to RAG tasks, the method produces a Graphical Eigen Memory (GEM), a graph that can be used to explore the text and to see which components are relevant to a given query.
The main contributions of the paper are:
1. Proposing a new RAG system, inspired by human cognition, that encodes, stores, and retrieves information based on its utility.
2. Formalizing the generation of summary nodes as a random walk or spectral decomposition problem, using the eigenvectors of the graph to generate the summary nodes.
3. Demonstrating the effectiveness of the method on two question-answering datasets with several embedding models and language models, and conducting ablation studies to better understand the method's impact.
4. Releasing an interactive web demo of an example GEM, showing how the graph works and highlighting its utility as a standalone object.
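The graph-and-eigenvector idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-D vectors stand in for real SBERT or OpenAI embeddings, and the blend weight `alpha`, the max over utility-question pairs, the power-iteration solver, and the helper names (`build_graph`, `leading_eigenvector`, `summary_members`) are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_graph(text_vecs, question_vecs, alpha=0.5):
    """Weighted adjacency matrix over chunks.

    Assumption: an edge weight blends chunk-text similarity with the best
    similarity between any pair of the two chunks' utility questions.
    """
    n = len(text_vecs)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            text_sim = cosine(text_vecs[i], text_vecs[j])
            q_sim = max(cosine(q, r)
                        for q in question_vecs[i] for r in question_vecs[j])
            weight = max(0.0, alpha * text_sim + (1 - alpha) * q_sim)
            W[i][j] = W[j][i] = weight
    return W

def leading_eigenvector(W, iters=200):
    """Power iteration on W + I.

    Adding the identity shifts every eigenvalue by 1 without changing the
    eigenvectors, which avoids oscillation when W has a zero diagonal.
    """
    n = len(W)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [v[i] + sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def summary_members(W, k):
    """Indices of the k chunks with the largest leading-eigenvector components.

    Assumption: these chunks would be handed to an LLM to write one summary node.
    """
    v = leading_eigenvector(W)
    return sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k]

# Toy data: chunks 0-2 share a theme, chunk 3 is unrelated.
text_vecs = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
# One hypothetical utility-question embedding per chunk (reusing the text vectors).
question_vecs = [[v] for v in text_vecs]

W = build_graph(text_vecs, question_vecs)
members = summary_members(W, k=3)
```

In this toy graph the three thematically related chunks dominate the leading eigenvector, so they are the ones selected as the basis for a summary node. A fuller version would use further eigenvectors to capture additional themes.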