Abstract: Retrieval-Augmented Generation (RAG) enhances language models by retrieving and incorporating relevant external knowledge. However, traditional retrieve-and-generate processes may not be optimized for real-world scenarios, where queries might require multiple retrieval steps or none at all. In this paper, we propose Probing-RAG, which utilizes the hidden state representations from the intermediate layers of language models to adaptively determine the necessity of additional retrievals for a given query. By employing a pre-trained prober, Probing-RAG effectively captures the model's internal cognition, enabling reliable decision-making about retrieving external documents. Experimental results across five open-domain QA datasets demonstrate that Probing-RAG outperforms previous methods while reducing the number of redundant retrieval steps.

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper aims to address several key issues in existing Retrieval-Augmented Generation (RAG) systems:

1. **Multi-step Retrieval Requirement**: Traditional RAG methods typically employ a single-step retrieval strategy. For some complex queries, however, a single retrieval step may not supply enough external knowledge to generate an accurate answer, so these queries may require multi-step retrieval.
2. **Redundant Retrieval Steps**: In some cases, the language model already possesses enough internal knowledge to answer the question without external retrieval. Existing RAG methods often fail to identify these situations, which leads to unnecessary retrieval steps, increases computational overhead, and can cause knowledge conflicts.
3. **Knowledge Conflicts**: When externally retrieved knowledge is inconsistent with the model's internal parametric knowledge, the output may become inconsistent. Ideally, the language model should be able to detect and resolve these conflicts, but current methods perform poorly in this regard.
4. **Adaptive Adjustment of the Number of Retrievals**: When deciding whether to retrieve, existing methods typically rely on external classifiers or on the model's final output, and so fail to fully utilize the model's internal representations. The resulting decisions are not flexible or accurate enough.

### Solution

To address the above issues, the paper proposes the **Probing-RAG** method, which has the following main features:

- **Internal State Prober**: By introducing a prober in the intermediate layers of the language model, Probing-RAG can assess whether the model needs additional retrieval. The prober uses the model's hidden states to make this decision, and thereby determines more accurately when retrieval is necessary.
- **Dynamic Adjustment of the Number of Retrievals**: Probing-RAG dynamically decides whether to perform retrieval based on the model's internal state, avoiding unnecessary retrieval steps, reducing computational overhead, and improving the overall performance of the system.
- **Reduction of Knowledge Conflicts**: By more accurately determining when retrieval is needed, Probing-RAG reduces conflicts between external knowledge and the model's internal knowledge, improving the consistency and accuracy of the answers.

### Experimental Results

Experimental results show that Probing-RAG outperforms existing methods on multiple open-domain question-answering datasets, improving accuracy while significantly reducing unnecessary retrieval steps. Specifically:

- **Accuracy Improvement**: Across multiple datasets, Probing-RAG improves accuracy by approximately 6.59% over no retrieval and by approximately 8.35% over single-step retrieval.
- **Reduction of Redundant Retrieval**: Probing-RAG performs significantly fewer retrievals than other methods. For example, the retrieval counts of FLARE and DRAGIN are 2.67 times and 6.83 times that of Probing-RAG, respectively.

### Conclusion

By introducing an internal state prober, Probing-RAG addresses the multi-step retrieval requirement, redundant retrieval steps, and knowledge conflicts found in existing RAG methods, significantly improving the performance and efficiency of the system.
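The prober-driven retrieval decision described under Solution can be illustrated with a small sketch. This is not the paper's implementation: the logistic-regression form of `LinearProber`, the `threshold` and `max_steps` parameters, and the `toy_generate` and `toy_retrieve` stand-ins are all assumptions made for illustration. In a real system, `generate` would run the language model and return a hidden state taken from an intermediate layer.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class LinearProber:
    """Hypothetical prober: logistic regression over one hidden-state vector."""
    weights: Sequence[float]
    bias: float = 0.0

    def p_retrieve(self, hidden: Sequence[float]) -> float:
        # Probability that the model still needs external documents.
        z = sum(w * h for w, h in zip(self.weights, hidden)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))


def probing_rag(
    query: str,
    generate: Callable[[str, List[str]], Tuple[str, Sequence[float]]],
    retrieve: Callable[[str, int], List[str]],
    prober: LinearProber,
    threshold: float = 0.5,  # assumed decision threshold
    max_steps: int = 3,      # assumed cap on retrieval rounds
) -> Tuple[str, int]:
    """Return the final answer and the number of retrieval steps used."""
    docs: List[str] = []
    # First pass uses only the model's own (parametric) knowledge.
    answer, hidden = generate(query, docs)
    steps = 0
    # Retrieve again only while the prober signals that more evidence is needed.
    while steps < max_steps and prober.p_retrieve(hidden) >= threshold:
        docs.extend(retrieve(query, steps))
        steps += 1
        answer, hidden = generate(query, docs)
    return answer, steps


def toy_generate(query: str, docs: List[str]) -> Tuple[str, List[float]]:
    # Stand-in for a language model. The single "hidden" feature is high when
    # the model lacks knowledge and drops as documents are added.
    if query.startswith("easy"):
        hidden = [-1.0]
    else:
        hidden = [1.5 - len(docs)]
    return f"answer using {len(docs)} docs", hidden


def toy_retrieve(query: str, step: int) -> List[str]:
    # Stand-in for a retriever: one placeholder document per step.
    return [f"doc-{step}"]


if __name__ == "__main__":
    prober = LinearProber(weights=[1.0])
    print(probing_rag("hard question", toy_generate, toy_retrieve, prober))
    print(probing_rag("easy question", toy_generate, toy_retrieve, prober))
```

In this toy setup the "easy" query is answered with zero retrievals and the "hard" query stops after two, which mirrors the two cases the summary highlights: skipping retrieval when internal knowledge suffices and retrieving more than once when it does not.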

Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval
