Abstract: Multi-hop question answering is a challenging task with distinct industrial relevance, and Retrieval-Augmented Generation (RAG) methods based on large language models (LLMs) have become a popular approach to tackle this task. Owing to the potential inability to retrieve all necessary information in a single iteration, a series of iterative RAG methods has been recently developed, showing significant performance improvements. However, existing methods still face two critical challenges: context overload resulting from multiple rounds of retrieval, and over-planning and repetitive planning due to the lack of a recorded retrieval trajectory. In this paper, we propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer. This summarizer compresses information from retrieved documents, targeting both the overarching question and the current sub-question concurrently. Experimental results on the multi-hop question-answering datasets HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art, and exhibits excellent robustness concerning context length.


### What problems does this paper attempt to solve?

This paper focuses on two key challenges in multi-hop question answering:

1. **Context overload**: Over multiple retrieval rounds, iterative Retrieval-Augmented Generation (iterative RAG) methods must handle ever longer document contexts. This tends to introduce more noise and raises the risk that the model misses key information when generating an answer.
2. **Over-planning and repetitive planning**: Existing iterative RAG methods do not record the retrieval trajectory, so the model struggles to judge whether it already has enough information to answer the main question, or whether a given sub-question has already been retrieved. As a result, the model may keep generating new sub-questions when none are needed (over-planning) or regenerate sub-questions that have already been retrieved (repetitive planning).

To address these problems, the authors propose a new iterative RAG method, ReSP (Retrieve, Summarize, Plan). ReSP introduces a dual-function summarizer that compresses the information in retrieved documents with respect to both the main question and the current sub-question at the same time. Specifically:

- **Global Evidence Memory**: Stores summaries of information related to the main question, helping the model decide when to stop iterating.
- **Local Pathway Memory**: Stores summaries of information related to the current sub-question, preventing repetitive planning.

In this way, ReSP addresses context overload and also improves the planning process in multi-hop question answering by avoiding over-planning and repetitive planning.

### Experimental Results

The experimental results show that ReSP significantly outperforms existing single-round and iterative RAG methods on two multi-hop question answering datasets, HotpotQA and 2WikiMultihopQA. On HotpotQA, ReSP improves the F1 score by 4.1 points over the existing state-of-the-art (SOTA) method; on 2WikiMultihopQA, the improvement is 5.9 points. ReSP is also robust to contexts of different lengths: it maintains a stable and concise context in each iteration, so the generated answers are not affected by changes in document length.
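The Retrieve, Summarize, Plan loop and its two memories, as described above, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `resp_loop`, the `Memory` class, and all `toy_*` stand-ins are hypothetical names, the toy corpus is placeholder data, and in a real system each callable would be backed by a retriever or an LLM prompt. Two simplifications are worth noting: the first round is assumed to retrieve with the main question itself, and repetition is detected by an exact string match against the local pathway memory rather than by the model reading that memory.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Memory:
    """Summaries kept across iterations (names follow the text above)."""
    # Summaries written with the main question as the target.
    global_evidence: List[str] = field(default_factory=list)
    # (sub-question, summary) pairs: the recorded retrieval trajectory.
    local_pathway: List[Tuple[str, str]] = field(default_factory=list)


def resp_loop(
    question: str,
    retrieve: Callable[[str], List[str]],
    summarize: Callable[[str, List[str]], str],
    is_sufficient: Callable[[str, Memory], bool],
    plan_next: Callable[[str, Memory], Optional[str]],
    generate: Callable[[str, Memory], str],
    max_iterations: int = 5,
) -> str:
    """Hypothetical sketch of an iterative retrieve/summarize/plan loop."""
    memory = Memory()
    sub_question = question  # assumption: round one uses the main question
    for _ in range(max_iterations):
        docs = retrieve(sub_question)
        # Dual-function summarizer: one summary per target, so the context
        # passed forward stays short regardless of document length.
        memory.global_evidence.append(summarize(question, docs))
        memory.local_pathway.append((sub_question, summarize(sub_question, docs)))
        # Stop as soon as the global evidence suffices (avoids over-planning).
        if is_sufficient(question, memory):
            break
        next_q = plan_next(question, memory)
        already_asked = {q for q, _ in memory.local_pathway}
        # Never re-ask a recorded sub-question (avoids repetitive planning).
        if next_q is None or next_q in already_asked:
            break
        sub_question = next_q
    return generate(question, memory)


# --- Toy stand-ins (placeholder data, no real retriever or LLM) ---
TOY_CORPUS = [
    "Film X was directed by Person Y.",
    "Person Y was born in City Z.",
]


def toy_retrieve(query: str) -> List[str]:
    # Return documents whose subject (first two words) is mentioned in the query.
    return [d for d in TOY_CORPUS if " ".join(d.split()[:2]) in query]


def toy_summarize(target: str, docs: List[str]) -> str:
    # A real summarizer would compress docs with respect to `target`.
    return " ".join(docs)


def toy_is_sufficient(question: str, memory: Memory) -> bool:
    return any("born in" in s for s in memory.global_evidence)


def toy_plan_next(question: str, memory: Memory) -> Optional[str]:
    return "Where was Person Y born?"


def toy_generate(question: str, memory: Memory) -> str:
    return " ".join(memory.global_evidence)


if __name__ == "__main__":
    print(resp_loop(
        "Where was the director of Film X born?",
        toy_retrieve, toy_summarize, toy_is_sufficient,
        toy_plan_next, toy_generate,
    ))
```

In this toy run the main question only surfaces the first document, the planner proposes one follow-up sub-question, and the loop stops once the global evidence contains the birthplace. Passing the components in as callables keeps the control flow separate from any particular retriever or model.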

Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach
