Abstract: Retrieval-augmented generation (RAG) techniques leverage the in-context learning capabilities of large language models (LLMs) to produce more accurate and relevant responses. Originating from the simple 'retrieve-then-read' approach, the RAG framework has evolved into a highly flexible and modular paradigm. A critical component, the Query Rewriter module, enhances knowledge retrieval by generating a search-friendly query, which aligns input questions more closely with the knowledge base. Our research identifies opportunities to extend the Query Rewriter module into Query Rewriter+, which generates multiple queries to overcome the Information Plateaus associated with a single query and rewrites questions to eliminate Ambiguity, thereby clarifying the underlying intent. We also find that current RAG systems exhibit issues with Irrelevant Knowledge; to overcome this, we propose the Knowledge Filter. Both modules are built on the instruction-tuned Gemma-2B model and together enhance response quality. The final issue identified is Redundant Retrieval; to address it, we introduce the Memory Knowledge Reservoir and the Retriever Trigger. The former supports dynamic expansion of the RAG system's knowledge base in a parameter-free manner, while the latter optimizes the cost of accessing external knowledge, thereby improving resource utilization and response efficiency. These four RAG modules synergistically improve the response quality and efficiency of the RAG system. The effectiveness of these modules has been validated through experiments and ablation studies across six common QA datasets. The source code is available at https://github.com/Ancientshi/ERM4.
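
As a rough illustration of how the four modules named in the abstract might fit together, the Python sketch below caches previously retrieved knowledge (Memory Knowledge Reservoir), skips external retrieval when the cache already covers a query (Retriever Trigger), and discards passages a filter rejects (Knowledge Filter). It takes the multiple queries produced by a Query Rewriter+ step as input. Every name here, the token-overlap similarity, and the 0.6 threshold are assumptions made for illustration; the abstract does not specify these details, and the authors' actual implementation is in the linked repository.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase token sets (an assumed stand-in similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if (ta | tb) else 0.0


@dataclass
class MemoryKnowledgeReservoir:
    """Stores passages retrieved for past queries; no model parameters are updated."""
    store: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, query: str, passages: List[str]) -> None:
        self.store.setdefault(query, []).extend(passages)

    def lookup(self, query: str, threshold: float) -> List[str]:
        """Return cached passages from past queries similar enough to `query`."""
        hits: List[str] = []
        for past_query, passages in self.store.items():
            if token_overlap(query, past_query) >= threshold:
                hits.extend(passages)
        return hits


def gather_knowledge(
    queries: List[str],                    # e.g. the output of a Query Rewriter+ step
    reservoir: MemoryKnowledgeReservoir,
    retrieve: Callable[[str], List[str]],  # hypothetical external retriever
    keep: Callable[[str, str], bool],      # hypothetical Knowledge Filter decision
    threshold: float = 0.6,                # assumed trigger threshold
) -> List[str]:
    knowledge: List[str] = []
    for q in queries:
        cached = reservoir.lookup(q, threshold)
        if cached:
            # Retriever Trigger: the reservoir covers this query, so skip retrieval.
            passages = cached
        else:
            passages = retrieve(q)
            reservoir.add(q, passages)
        for p in passages:
            # Knowledge Filter: keep only relevant, not-yet-collected passages.
            if keep(q, p) and p not in knowledge:
                knowledge.append(p)
    return knowledge
```

In this sketch the second of two near-duplicate queries is served from the reservoir, so the external retriever is called only once; a real system would likely use embedding similarity and a learned filter rather than these placeholders.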

Generative Reader Optimization in the RAG-System

Improving Reading Comprehension Question Generation with Data Augmentation and Overgenerate-and-rank

Enhancing Retrieval and Managing Retrieval: A Four-Module Synergy for Improved Quality and Efficiency in RAG Systems

Towards Optimizing a Retrieval Augmented Generation using Large Language Model on Academic Data

The Chronicles of RAG: The Retriever, the Chunk and the Generator

An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms

Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers

Advancing Question-Answering in Ophthalmology with Retrieval Augmented Generations (RAG): Benchmarking Open-source and Proprietary Large Language Models

Optimizing and Evaluating Enterprise Retrieval-Augmented Generation (RAG): A Content Design Perspective

Toward Optimal Search and Retrieval for RAG

RAGProbe: An Automated Approach for Evaluating RAG Applications

ChatQA: Surpassing GPT-4 on Conversational QA and RAG

RAGGED: Towards Informed Design of Retrieval Augmented Generation Systems

Towards Understanding Retrieval Accuracy and Prompt Quality in RAG Systems

Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage

Retrieval-Augmented Generation for Domain-Specific Question Answering: A Case Study on Pittsburgh and CMU

FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering

Advanced Retrieval Augmented Generation: Multilingual Semantic Retrieval across Document Types by Finetuning Transformer Based Language Models and OCR Integration

GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering

Revolutionizing Retrieval-Augmented Generation with Enhanced PDF Structure Recognition