ActiveRAG: Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents

Zhipeng Xu, Zhenghao Liu, Yukun Yan, Shuo Wang, Shi Yu, Zheni Zeng, Chaojun Xiao, Zhiyuan Liu, Ge Yu, Chenyan Xiong

2024-10-17

Abstract: Retrieval-Augmented Generation (RAG) enables Large Language Models (LLMs) to leverage external knowledge, enhancing their performance on knowledge-intensive tasks. However, existing RAG models often treat LLMs as passive recipients of information, which can lead to interference from noisy retrieved content. In this paper, we introduce ActiveRAG, a multi-agent framework that mimics human learning behavior to help LLMs actively engage with and learn from retrieved evidence. ActiveRAG designs a knowledge assimilation agent that forms an understanding of the retrieved knowledge by associating external knowledge with the parametric memory of LLMs. Our model then employs a thought accommodation agent to calibrate the internal thought of LLMs for response refinement. Our experiments show that ActiveRAG achieves a 10% improvement over vanilla RAG on various question-answering benchmarks. Further analysis reveals that ActiveRAG mitigates the impact of noisy retrievals, alleviates conflicts between external knowledge and parametric memory, and improves the self-consistency of LLMs in answering questions. All data and code are available at https://github.com/OpenMatch/ActiveRAG.

Computation and Language

What problem does this paper attempt to address?

The paper addresses two limitations in how existing Retrieval-Augmented Generation (RAG) models use external knowledge. First, current RAG models often treat large language models (LLMs) as passive recipients of information, so noise in the retrieved content flows directly into the model's input and degrades its answers. Second, performance can suffer when external knowledge conflicts with the LLM's parametric memory. To address these issues, the paper proposes ActiveRAG, a multi-agent framework that mimics human learning behavior so that LLMs actively engage with and learn from the retrieved evidence. ActiveRAG uses a knowledge assimilation agent to associate external knowledge with the LLM's parametric memory and form an understanding of that knowledge. A thought accommodation agent then calibrates the LLM's internal thought to refine the response. The reported experiments show a 10% improvement over vanilla RAG on various question-answering benchmarks. The analysis also indicates that ActiveRAG mitigates the impact of noisy retrievals, alleviates conflicts between external knowledge and parametric memory, and improves the self-consistency of LLMs in answering questions.
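
The two-stage flow described above (knowledge assimilation, then thought accommodation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the function names (`assimilate`, `draft_thought`, `accommodate`, `active_rag_sketch`), the separate retrieval-free drafting step, and the `llm` callable are all hypothetical and introduced here only to make the data flow concrete.

```python
from typing import Callable, List

# Assumption: an "LLM" is any function that maps a prompt string to a text reply.
LLM = Callable[[str], str]


def assimilate(llm: LLM, question: str, passages: List[str]) -> str:
    """Assimilation step: relate retrieved passages to the model's own knowledge."""
    evidence = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        f"Question: {question}\n"
        f"Retrieved passages:\n{evidence}\n"
        "Explain what these passages add to, or contradict in, what you already know."
    )
    return llm(prompt)


def draft_thought(llm: LLM, question: str) -> str:
    """Draft an initial line of reasoning from the model's own knowledge only."""
    prompt = (
        f"Question: {question}\n"
        "Reason step by step using only your own knowledge."
    )
    return llm(prompt)


def accommodate(llm: LLM, question: str, thought: str, understanding: str) -> str:
    """Accommodation step: revise the draft reasoning using the assimilated knowledge."""
    prompt = (
        f"Question: {question}\n"
        f"Initial reasoning:\n{thought}\n"
        f"Knowledge understanding:\n{understanding}\n"
        "Correct the reasoning where the knowledge disagrees, then give a final answer."
    )
    return llm(prompt)


def active_rag_sketch(llm: LLM, question: str, passages: List[str]) -> str:
    """Chain the steps: assimilate evidence, draft a thought, then calibrate it."""
    understanding = assimilate(llm, question, passages)
    thought = draft_thought(llm, question)
    return accommodate(llm, question, thought, understanding)
```

Any prompt-to-text function can stand in for `llm`, for example a thin wrapper around a hosted model or a local one. The point of the sketch is only that the retrieved passages reach the final answer indirectly, through an intermediate understanding, rather than being pasted straight into the answering prompt.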

Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models

WeKnow-RAG: An Adaptive Approach for Retrieval-Augmented Generation Integrating Web Search and Knowledge Graphs

Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models

AssistRAG: Boosting the Potential of Large Language Models with an Intelligent Information Assistant

Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

Retrieval-Augmented Generation for Large Language Models: A Survey

IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains

Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs

Meta Knowledge for Retrieval Augmented Large Language Models

RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards

LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

LightRAG: Simple and Fast Retrieval-Augmented Generation

Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models

RAG-Studio: Towards In-Domain Adaptation of Retrieval Augmented Generation Through Self-Alignment

Enhancing Retrieval and Managing Retrieval: A Four-Module Synergy for Improved Quality and Efficiency in RAG Systems

REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering