ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems

Ishneet Sukhvinder Singh, Ritvik Aggarwal, Ibrahim Allahverdiyev, Muhammad Taha, Aslihan Akalin, Kevin Zhu, Sean O'Brien
2024-10-30
Abstract: Retrieval-Augmented Generation (RAG) systems using large language models (LLMs) often generate inaccurate responses due to the retrieval of irrelevant or loosely related information. Existing methods, which operate at the document level, fail to effectively filter out such content. We propose LLM-driven chunk filtering, ChunkRAG, a framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level. Our approach employs semantic chunking to divide documents into coherent sections and utilizes LLM-based relevance scoring to assess each chunk's alignment with the user's query. By filtering out less pertinent chunks before the generation phase, we significantly reduce hallucinations and improve factual accuracy. Experiments show that our method outperforms existing RAG models, achieving higher accuracy on tasks requiring precise information retrieval. This advancement enhances the reliability of RAG systems, making them particularly beneficial for applications like fact-checking and multi-hop reasoning.
Computation and Language
What problem does this paper attempt to address?
This paper addresses the inaccurate responses produced by Retrieval-Augmented Generation (RAG) systems built on large language models (LLMs) when retrieval returns irrelevant or weakly related information. Existing methods usually operate at the document level and cannot effectively filter such content, which leads to factual errors, irrelevant information, and hallucinations in the generated responses. The paper proposes a new method, LLM-driven chunk filtering (ChunkRAG), which evaluates and filters retrieved information at the chunk level. Specifically, the paper points out that traditional RAG systems retrieve large amounts of text and assume that these long passages contain relevant information, but rarely examine each section or paragraph of a retrieved document separately. As a result, irrelevant or only partially relevant information is likely to reach the generation stage. In addition, a language model's ability to generate fluent text does not mean it can verify the information it draws on, which further exacerbates the problem. These problems are particularly serious in critical applications such as open-domain question answering and multi-hop reasoning, which require a high degree of reliability.
To solve these problems, the ChunkRAG framework divides documents into coherent sections through semantic chunking and uses LLM-based relevance scores to evaluate how well each chunk aligns with the user's query. By filtering out less relevant chunks before the generation stage, the method significantly reduces hallucinations and improves factual accuracy.
Experimental results show that this method outperforms existing RAG models on tasks requiring precise information retrieval, improves the reliability of the system, and is particularly suitable for tasks such as fact-checking and multi-hop reasoning.
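The chunk-then-filter idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the bag-of-words similarity stands in for a real sentence embedding, `overlap_scorer` is a hypothetical placeholder for the LLM relevance score, and the thresholds `sim_threshold` and `min_score` are arbitrary values chosen for the example.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def _bow(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_chunks(sentences: List[str], sim_threshold: float = 0.2) -> List[str]:
    """Group consecutive sentences, starting a new chunk when similarity
    to the previous sentence drops below sim_threshold."""
    groups: List[List[str]] = []
    for sent in sentences:
        if groups and _cosine(_bow(groups[-1][-1]), _bow(sent)) >= sim_threshold:
            groups[-1].append(sent)
        else:
            groups.append([sent])
    return [" ".join(g) for g in groups]


def overlap_scorer(query: str, chunk: str) -> float:
    """Hypothetical stand-in for an LLM relevance score in [0, 1]:
    the fraction of query terms that appear in the chunk."""
    q_terms = set(_bow(query))
    if not q_terms:
        return 0.0
    return len(q_terms & set(_bow(chunk))) / len(q_terms)


def filter_chunks(
    query: str,
    chunks: List[str],
    score_fn: Callable[[str, str], float],
    min_score: float = 0.5,
) -> List[str]:
    """Keep only the chunks whose relevance score reaches min_score."""
    return [c for c in chunks if score_fn(query, c) >= min_score]


sentences = [
    "Paris is the capital of France.",
    "The capital of France hosts the Louvre.",
    "Bananas are rich in potassium.",
]
chunks = semantic_chunks(sentences)
kept = filter_chunks("capital of France", chunks, overlap_scorer)
print(kept)
```

In a real system, `score_fn` would wrap a call to an LLM that rates each chunk against the query, and only the surviving chunks would be passed to the generation step. Passing the scorer as a callable keeps the filtering logic independent of whichever model is used.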