IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues

Diji Yang, Jinmeng Rao, Kezhen Chen, Xiaoyuan Guo, Yawen Zhang, Jie Yang, Yi Zhang

2024-05-15

Abstract: Although the Retrieval-Augmented Generation (RAG) paradigms can use external knowledge to enhance and ground the outputs of Large Language Models (LLMs) to mitigate generative hallucinations and static knowledge base problems, they still suffer from limited flexibility in adopting Information Retrieval (IR) systems with varying capabilities, constrained interpretability during the multi-round retrieval process, and a lack of end-to-end optimization. To address these challenges, we propose a novel LLM-centric approach, IM-RAG, that integrates IR systems with LLMs to support multi-round RAG through learning Inner Monologues (IM, i.e., the human inner voice that narrates one's thoughts). During the IM process, the LLM serves as the core reasoning model (i.e., Reasoner) to either propose queries to collect more information via the Retriever or to provide a final answer based on the conversational context. We also introduce a Refiner that improves the outputs from the Retriever, effectively bridging the gap between the Reasoner and IR modules with varying capabilities and fostering multi-round communications. The entire IM process is optimized via Reinforcement Learning (RL) where a Progress Tracker is incorporated to provide mid-step rewards, and the answer prediction is further separately optimized via Supervised Fine-Tuning (SFT). We conduct extensive experiments with the HotPotQA dataset, a popular benchmark for retrieval-based, multi-step question-answering. The results show that our approach achieves state-of-the-art (SOTA) performance while providing high flexibility in integrating IR modules as well as strong interpretability exhibited in the learned inner monologues.

Computation and Language, Artificial Intelligence, Information Retrieval

What problem does this paper attempt to address?

This paper addresses three limitations of current Retrieval-Augmented Generation (RAG) methods. RAG grounds the outputs of large language models (LLMs) in external knowledge, which mitigates hallucination and the problem of a static knowledge base, but existing approaches have limited flexibility in adopting information retrieval (IR) systems with varying capabilities, offer limited interpretability during multi-round retrieval, and lack end-to-end optimization. The paper proposes IM-RAG, which supports multi-round retrieval-augmented generation by learning inner monologues (IM). At the core of IM-RAG is an LLM-based reasoning model (the Reasoner), which at each step either proposes a query to collect more information through the Retriever or gives the final answer based on the conversational context. A Refiner improves the Retriever's outputs, bridging the gap between the Reasoner and IR modules with varying capabilities and supporting multi-round communication. The entire IM process is optimized with reinforcement learning (RL), with a Progress Tracker providing mid-step rewards, and answer prediction is further optimized separately with supervised fine-tuning (SFT). Experiments on HotPotQA, a benchmark for retrieval-based, multi-step question answering, show that IM-RAG achieves state-of-the-art performance while remaining flexible in integrating IR modules and producing interpretable inner monologues.
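The control flow described above can be illustrated with a small sketch. This is not the authors' implementation: every name here (`run_im_loop`, `Monologue`, `Turn`, the `toy_*` components, and the tiny `CORPUS`) is a hypothetical stand-in. In the paper the Reasoner is an LLM trained with RL and the Progress Tracker's rewards drive that training; here the Reasoner is a scripted function and the rewards are only recorded, so the sketch shows the query-or-answer loop and nothing about optimization.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    query: str      # query proposed by the Reasoner
    evidence: str   # Retriever output after the Refiner has condensed it
    reward: float   # mid-step reward from the Progress Tracker


@dataclass
class Monologue:
    question: str
    turns: List[Turn] = field(default_factory=list)
    answer: str = ""


def run_im_loop(question, reasoner, retriever, refiner, tracker, max_rounds=3):
    """Alternate between querying and answering until the Reasoner answers."""
    state = Monologue(question)
    for _ in range(max_rounds):
        action, text = reasoner(state, force_answer=False)
        if action == "answer":
            state.answer = text
            return state
        docs = retriever(text)            # raw retrieval results
        evidence = refiner(text, docs)    # cleaned-up evidence for the Reasoner
        reward = tracker(state, evidence) # score the progress made this round
        state.turns.append(Turn(text, evidence, reward))
    # Round budget exhausted: ask the Reasoner to answer with what it has.
    _, state.answer = reasoner(state, force_answer=True)
    return state


# --- Toy components (assumptions for illustration only) ---
CORPUS = {
    "director of Inception": "Inception was directed by Christopher Nolan.",
    "Christopher Nolan birthplace": "Christopher Nolan was born in London.",
}


def toy_reasoner(state, force_answer=False):
    # Scripted two-hop plan standing in for a learned policy.
    plan = ["director of Inception", "Christopher Nolan birthplace"]
    if len(state.turns) < len(plan) and not force_answer:
        return "query", plan[len(state.turns)]
    if len(state.turns) == len(plan):
        return "answer", "London"
    return "answer", "unknown"


def toy_retriever(query):
    # Exact-match lookup standing in for a real IR system.
    return [text for key, text in CORPUS.items() if key == query]


def toy_refiner(query, docs):
    # Keep only the top document; a real Refiner could rerank or rewrite.
    return docs[0] if docs else ""


def toy_tracker(state, evidence):
    # Reward a round only if it produced evidence not seen before.
    seen = {turn.evidence for turn in state.turns}
    return 1.0 if evidence and evidence not in seen else 0.0


result = run_im_loop(
    "Where was the director of Inception born?",
    toy_reasoner, toy_retriever, toy_refiner, toy_tracker,
)
for turn in result.turns:
    print(turn.query, "|", turn.evidence, "|", turn.reward)
print("answer:", result.answer)
```

The recorded `turns` are what makes the process inspectable: each round keeps the query, the refined evidence, and the reward it earned. Swapping `toy_retriever` for a stronger or weaker search backend leaves the loop unchanged, which mirrors the flexibility claim in the summary above.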

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation

SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains

Retrieval-Augmented Generation for Large Language Models: A Survey

ActiveRAG: Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents

RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation

Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation

Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG

RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation

Learning When to Retrieve, What to Rewrite, and How to Respond in Conversational QA

Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation

RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback

LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

AssistRAG: Boosting the Potential of Large Language Models with an Intelligent Information Assistant

DMQR-RAG: Diverse Multi-Query Rewriting for RAG

Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation

Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation

SFR-RAG: Towards Contextually Faithful LLMs

DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation