Unleashing Multi-Hop Reasoning Potential in Large Language Models through Repetition of Misordered Context

Sangwon Yu, Ik-hwan Kim, Jongyoon Song, Saehyung Lee, Junsung Park, Sungroh Yoon
2024-10-10
Abstract: Multi-hop reasoning, which requires multi-step reasoning based on the supporting documents within a given context, remains challenging for large language models (LLMs). LLMs often struggle to filter out irrelevant documents within the context, and their performance is sensitive to the position of supporting documents within that context. In this paper, we identify an additional challenge: LLMs' performance is also sensitive to the order in which the supporting documents are presented. We refer to this as the misordered context problem. To address this issue, we propose a simple yet effective method called context repetition (CoRe), which involves prompting the model by repeatedly presenting the context to ensure the supporting documents are presented in the optimal order for the model. Using CoRe, we improve the F1 score by up to 30%p on multi-hop QA tasks and increase accuracy by up to 70%p on a synthetic task. Additionally, CoRe helps mitigate the well-known "lost-in-the-middle" problem in LLMs and can be effectively combined with retrieval-based approaches utilizing Chain-of-Thought (CoT) reasoning.
Computation and Language
What problem does this paper attempt to address?
This paper addresses several challenges that large language models (LLMs) face in multi-hop reasoning tasks:

1. **Order Sensitivity of Supporting Documents**: LLM performance is highly sensitive to the order in which supporting documents are presented. When that order does not match the sequence of reasoning steps the question requires, performance can decline significantly. The authors call this the "misordered context problem."
2. **Lost-in-the-Middle Problem**: When supporting documents sit in the middle of the context, the model may fail to identify them, which degrades reasoning performance.
3. **Interference from Noisy Documents**: LLMs struggle to filter out documents that are irrelevant to the correct answer, which can severely hurt reasoning performance.

To address these issues, the authors propose a simple yet effective method called "Context Repetition" (CoRe). By presenting the context to the model repeatedly, CoRe ensures that the supporting documents appear in the optimal order somewhere in the prompt, which improves performance on multi-hop reasoning tasks.

### Main Contributions

1. **Introduction of the Misordered Context Problem**: Shows that the order of supporting documents affects LLM performance and defines this as a key challenge.
2. **Theoretical Analysis and Method Proposal**: Argues theoretically that the problem can be addressed by augmenting the context, and introduces the CoRe method on that basis.
3. **Experimental Validation**: Evaluates CoRe on multiple multi-hop QA benchmark datasets and a synthetic task, demonstrating its effectiveness.

### Experimental Results

- On multi-hop QA datasets such as HotpotQA, 2WikiMultihopQA, and MuSiQue, CoRe improved the model's F1 score by up to 30 percentage points (30%p).
- On the synthetic task, CoRe improved the model's accuracy by up to 70 percentage points (70%p).
- Beyond multi-hop reasoning gains, CoRe also alleviated the "lost-in-the-middle" problem and can be combined with retrieval-based methods that use Chain-of-Thought (CoT) reasoning.

### Conclusion

This study identifies the sensitivity of LLMs to the order of supporting documents in multi-hop reasoning tasks and addresses it with the CoRe method, improving reasoning performance, especially on complex multi-hop tasks.
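
The idea behind context repetition can be illustrated with a short sketch. This is not the authors' implementation: the prompt template, the function names (`build_core_prompt`, `covers_all_orders`), and the `repeats` parameter are hypothetical, chosen only for illustration. The sketch builds a prompt that repeats the whole context, then checks a general combinatorial fact: if a list of documents is repeated k times, every ordering of any k of those documents appears somewhere in the repeated sequence as an in-order (not necessarily contiguous) subsequence.

```python
from itertools import permutations


def build_core_prompt(documents, question, repeats=2):
    """Build a prompt that presents the full context `repeats` times.

    The template below is an assumed, illustrative format, not the
    template used in the paper.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    context = "\n".join(
        f"Document [{i}]: {text}" for i, text in enumerate(documents, start=1)
    )
    repeated = "\n\n".join([context] * repeats)
    return f"{repeated}\n\nQuestion: {question}\nAnswer:"


def is_subsequence(needle, haystack):
    """Return True if `needle` occurs in `haystack` in order, gaps allowed."""
    remaining = iter(haystack)
    # `item in remaining` consumes the iterator up to the first match,
    # so later items must be found strictly after earlier ones.
    return all(item in remaining for item in needle)


def covers_all_orders(num_docs, supporting, repeats):
    """Check that every ordering of the supporting-document indices
    appears in the document sequence repeated `repeats` times."""
    sequence = list(range(num_docs)) * repeats
    return all(is_subsequence(p, sequence) for p in permutations(supporting))


# Two supporting documents (indices 1 and 3) among four documents.
print(covers_all_orders(4, [1, 3], repeats=1))  # False: order (3, 1) is missing
print(covers_all_orders(4, [1, 3], repeats=2))  # True
```

With a single pass over the context, the order (3, 1) never occurs, so a question whose first hop depends on document 3 sees its evidence in the wrong order. Repeating the context twice makes both orders available, at the cost of a prompt roughly twice as long.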