Abstract: Retrieval-augmented large language models (LLMs) have demonstrated efficacy in knowledge-intensive tasks such as open-domain QA, addressing inherent challenges in knowledge update and factual inadequacy. However, inconsistencies between the retrieved knowledge and the knowledge LLMs actually need lead to a decline in LLMs' answer quality. This paper introduces BIDER, an approach that refines retrieved documents into Key Supporting Evidence (KSE) through knowledge synthesis, supervised fine-tuning (SFT), and preference alignment. We train BIDER by learning from crafted KSE, while optimizing its output to align with the LLM's information acquisition preferences through reinforcement learning. Evaluations across five datasets show that BIDER boosts LLMs' answer quality by 7% while reducing the input content length of retrieved documents by 80%, outperforming existing methods. The proposed KSE simulation effectively equips LLMs with essential information for accurate question answering.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper addresses a key issue faced by retrieval-augmented large language models (RAG) in knowledge-intensive tasks: the inconsistency between the retrieved knowledge and the knowledge the model actually requires. Although existing RAG methods can enhance generation quality through external knowledge sources, the retrieved documents often contain redundant and noisy information, both because the retrieval system is imperfect and because the model's own knowledge is inaccessible to the retriever. This not only increases the input length but may also reduce the quality of the generated answers.

### Background and Motivation

Large Language Models (LLMs) perform well on knowledge-intensive tasks, but they face challenges in keeping knowledge up to date and in providing factual answers. Retrieval-Augmented Generation (RAG) has become a promising way to address these issues: it introduces external knowledge to improve the quality and reliability of generated answers. However, existing RAG methods often fail to remove noise from the retrieved documents, which degrades generation quality.

### Solution

To solve these problems, the authors propose BIDER (BrIDging knowledge inconsistency for efficient Retrieval-augmented LLMs), a method that refines retrieved documents into Key Supporting Evidence (KSE). BIDER is trained in three stages:

1. **Knowledge Synthesis Stage**: Gradually synthesizing real KSE through a three-step method.
   - **Fragment Extraction**: Extracting fragments from the retrieved documents that help answer the question.
   - **Fragment Refinement**: Retaining the minimal necessary set of fragments through an iterative selection method.
   - **Fragment Cleaning**: Further cleaning the candidate pool of fragments to avoid conflicts with the model's internal knowledge.
2. **Supervised Distillation Stage**: Building a seq2seq model that learns the mapping from retrieved documents to KSE.
3. **Preference Alignment Stage**: Using reinforcement learning to align the model's output with the information acquisition preferences of the downstream LLM, so that the refined documents contain coherent, easy-to-understand key information.

### Experimental Results

The authors evaluated BIDER on five datasets covering three types of knowledge-intensive tasks: open-domain question answering (NQ, TQA, HotpotQA), dialogue generation (WoW), and fact verification (FEVER). The results show that BIDER improves generation performance on all five datasets while reducing the input length by 80%, effectively compressing the retrieved documents. On the WoW dataset in particular, the performance improvement is close to 40%.

### Main Contributions

1. Proposing a three-step knowledge synthesis method for generating real KSE.
2. Introducing a method to refine retrieved documents into KSE, bridging the inconsistency between retrieved documents and the knowledge required by the model.
3. Using supervised distillation and preference alignment to train the refinement model, which improves RAG performance during inference by reducing input length and improving answer quality.

### Conclusion

By refining retrieved documents, BIDER addresses the inconsistency between the retrieved knowledge and the knowledge required by the model in RAG methods, improving both the quality and the efficiency of answer generation.
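To make the three-step knowledge synthesis stage from the Solution section more concrete, here is a minimal Python sketch. It is not the authors' implementation: the word-overlap scoring, the greedy coverage rule, the use of the gold answer during extraction, and every name in it (`extract_fragments`, `refine_fragments`, `clean_fragments`, `synthesize_kse`, `helps_llm`) are assumptions chosen for illustration. In particular, `helps_llm` is a hypothetical callable standing in for whatever check decides that a fragment adds something beyond the downstream model's internal knowledge.

```python
import re
from typing import Callable, List, Set

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "to", "and",
             "who", "what", "which", "by", "on"}


def tokens(text: str) -> Set[str]:
    """Lowercased content words; a crude stand-in for a learned relevance model."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def extract_fragments(question: str, answer: str, docs: List[str],
                      top_k: int = 5) -> List[str]:
    """Step 1 (extraction): keep sentences that share words with the question or gold answer."""
    target = tokens(question) | tokens(answer)
    sentences = [s.strip() for d in docs
                 for s in re.split(r"(?<=[.!?])\s+", d) if s.strip()]
    scored = [(len(target & tokens(s)), s) for s in sentences]
    # sorted() is stable, so ties keep their original document order.
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: -p[0])
    return [s for _, s in ranked[:top_k]]


def refine_fragments(question: str, answer: str, fragments: List[str]) -> List[str]:
    """Step 2 (refinement): greedily add the fragment that covers the most
    still-uncovered target words, and stop once nothing new is covered."""
    uncovered = tokens(question) | tokens(answer)
    selected: List[str] = []
    remaining = list(fragments)
    while uncovered and remaining:
        best = max(remaining, key=lambda f: len(uncovered & tokens(f)))
        gain = uncovered & tokens(best)
        if not gain:
            break
        selected.append(best)
        remaining.remove(best)
        uncovered -= gain
    return selected


def clean_fragments(fragments: List[str],
                    helps_llm: Callable[[str], bool]) -> List[str]:
    """Step 3 (cleaning): drop fragments the hypothetical `helps_llm` check rejects."""
    return [f for f in fragments if helps_llm(f)]


def synthesize_kse(question: str, answer: str, docs: List[str],
                   helps_llm: Callable[[str], bool]) -> str:
    """Run the three steps in order and join the surviving fragments."""
    extracted = extract_fragments(question, answer, docs)
    refined = refine_fragments(question, answer, extracted)
    return " ".join(clean_fragments(refined, helps_llm))


# Toy example: pretend the downstream LLM only benefits from the authorship sentence.
docs = [
    "Dracula is a novel published in 1897. It was written by Bram Stoker. "
    "The book has been adapted many times.",
    "Transylvania is a region in Romania. Bram Stoker was born in Dublin.",
]
kse = synthesize_kse("Who wrote the novel Dracula?", "Bram Stoker", docs,
                     helps_llm=lambda fragment: "Stoker" in fragment)
print(kse)
```

In this toy run, extraction keeps three of the five sentences, refinement keeps two, and cleaning leaves only the sentence naming the author. A pipeline like this depends on the gold answer, so it only makes sense for building training targets; at inference time the refiner trained in the later stages has to produce KSE from the question and documents alone.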

BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence
