Abstract: Retrieval-Augmented Generation (RAG) systems using large language models (LLMs) often generate inaccurate responses due to the retrieval of irrelevant or loosely related information. Existing methods, which operate at the document level, fail to effectively filter out such content. We propose LLM-driven chunk filtering, ChunkRAG, a framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level. Our approach employs semantic chunking to divide documents into coherent sections and utilizes LLM-based relevance scoring to assess each chunk's alignment with the user's query. By filtering out less pertinent chunks before the generation phase, we significantly reduce hallucinations and improve factual accuracy. Experiments show that our method outperforms existing RAG models, achieving higher accuracy on tasks requiring precise information retrieval. This advancement enhances the reliability of RAG systems, making them particularly beneficial for applications like fact-checking and multi-hop reasoning.

What problem does this paper attempt to address?

The paper addresses inaccurate responses in Retrieval-Augmented Generation (RAG) systems built on large language models (LLMs), which arise when the retriever returns irrelevant or weakly related information. Existing methods usually operate at the document level and cannot effectively filter such content, so generated responses contain factual errors, irrelevant material, and hallucinations. The paper proposes LLM-driven chunk filtering (ChunkRAG), which evaluates and filters retrieved information at the chunk level.

Specifically, the paper points out that traditional RAG systems retrieve large amounts of text and assume that these long passages contain relevant information, but rarely examine each section or paragraph of a retrieved document separately. Irrelevant or only partially relevant information is therefore likely to reach the generation stage. In addition, language models can generate fluent text but cannot verify the information they draw on, which further exacerbates the problem. These issues are particularly serious in applications such as open-domain question answering and multi-hop reasoning, which require a high degree of reliability.

To solve these problems, the ChunkRAG framework divides documents into coherent sections through semantic chunking and uses LLM-based relevance scores to evaluate how well each chunk aligns with the user query. Filtering out less relevant chunks before the generation stage significantly reduces hallucinations and improves factual accuracy. Experimental results show that this method outperforms existing RAG models on tasks requiring precise information retrieval, improves system reliability, and is particularly suitable for tasks such as fact-checking and multi-hop reasoning.
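The pipeline described above (semantic chunking, per-chunk relevance scoring, filtering before generation) can be illustrated with a minimal sketch. This is not the paper's implementation. The function names, the thresholds, and the bag-of-words similarity are all assumptions made for illustration: a real system would use sentence embeddings for chunking and an LLM call for scoring, and `lexical_scorer` is only a runnable placeholder for that LLM judge.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def _bow(text: str) -> Counter:
    """Bag-of-words vector; a toy stand-in for a sentence embedding (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_chunks(document: str, boundary_threshold: float = 0.2) -> List[str]:
    """Group consecutive sentences into chunks.

    A new chunk starts whenever a sentence's similarity to the previous
    sentence falls below boundary_threshold (a hypothetical value).
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    groups: List[List[str]] = []
    for sentence in sentences:
        if groups and _cosine(_bow(groups[-1][-1]), _bow(sentence)) >= boundary_threshold:
            groups[-1].append(sentence)
        else:
            groups.append([sentence])
    return [" ".join(group) for group in groups]


def lexical_scorer(query: str, chunk: str) -> float:
    """Placeholder for the LLM relevance judge (assumption):
    the fraction of distinct query terms that also appear in the chunk."""
    query_terms = set(_bow(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(_bow(chunk))) / len(query_terms)


def filter_chunks(
    query: str,
    chunks: List[str],
    score: Callable[[str, str], float] = lexical_scorer,
    min_score: float = 0.5,
) -> List[str]:
    """Keep only chunks whose relevance score reaches min_score (hypothetical cutoff)."""
    return [chunk for chunk in chunks if score(query, chunk) >= min_score]


document = (
    "Paris is the capital of France. The capital of France hosts the Louvre. "
    "Bananas are rich in potassium. Potassium in bananas supports muscle function."
)
query = "What is the capital of France?"

chunks = semantic_chunks(document)     # two chunks: one about France, one about bananas
kept = filter_chunks(query, chunks)    # only the France chunk passes the filter
print(kept)
```

Passing the scorer as a callable keeps the filtering step independent of how relevance is judged, so an LLM-backed function with the same `(query, chunk) -> float` signature could replace `lexical_scorer` without changing the rest of the sketch.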

ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems
