Conversational Query Reformulation with the Guidance of Retrieved Documents

Jeonghyun Park, Hwanhee Lee
2024-09-20
Abstract: Conversational search seeks to retrieve relevant passages for the given questions in conversational question answering. Conversational Query Reformulation (CQR) improves conversational search by refining the original queries into de-contextualized forms that resolve issues in the original queries, such as omissions and coreferences. Previous CQR methods focus on imitating human-written queries, which may not always yield meaningful search results for the retriever. In this paper, we introduce GuideCQR, a framework that refines queries for CQR by leveraging key information from the initially retrieved documents. Specifically, GuideCQR extracts keywords and generates expected answers from the retrieved documents, then unifies them with the queries after filtering to add useful information that enhances the search process. Experimental results demonstrate that our proposed method achieves state-of-the-art performance across multiple datasets, outperforming previous CQR methods. Additionally, we show that GuideCQR can obtain further performance gains in conversational search with various types of queries, even queries written by humans.
Computation and Language
What problem does this paper attempt to address?
### The Problem Addressed by the Paper

This paper addresses two key challenges in **Conversational Query Reformulation (CQR)**:

1. **Query Incompleteness and Coreference Issues**: In Conversational Question Answering (ConvQA), original queries often omit information or rely on coreference, so using them directly rarely yields good search results.
2. **Improving Retrieval Performance**: Existing CQR methods usually focus on mimicking human-written queries. Such queries may be easier to read, but they do not necessarily optimize the performance of the retrieval system.

The paper proposes a new framework called **GuideCQR**, which uses key information from the initially retrieved documents to improve query reformulation. Specifically, GuideCQR extracts keywords from the retrieved documents and generates expected answers, then, after filtering, combines this information with the original query to make it more useful to the retriever.

Experimental results show that this method achieves state-of-the-art performance on multiple datasets and yields additional gains across various types of queries, including queries written by humans.
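
The retrieve, extract, filter, and merge steps described above can be illustrated with a toy sketch. This is not the authors' implementation: the term-overlap retriever, the frequency-based keyword extraction, the "drop terms already in the query" filter, and all function names (`retrieve`, `extract_keywords`, `filter_new_terms`, `reformulate`) are assumptions made for illustration. The expected-answer generation step is left out because it needs a generative model.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "it", "and", "to",
             "did", "what", "who", "when", "where", "he", "she", "his", "her"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def content_terms(text):
    """Tokens of `text` with stopwords removed."""
    return [t for t in tokenize(text) if t not in STOPWORDS]


def retrieve(query, corpus, k=2):
    """Toy retriever: rank documents by the number of distinct content
    terms they share with the query and return the top k."""
    q_terms = set(content_terms(query))
    ranked = sorted(corpus,
                    key=lambda doc: len(q_terms & set(content_terms(doc))),
                    reverse=True)
    return ranked[:k]


def extract_keywords(docs, top_n=5):
    """Stand-in keyword extraction: most frequent content terms in `docs`."""
    counts = Counter(t for doc in docs for t in content_terms(doc))
    return [term for term, _ in counts.most_common(top_n)]


def filter_new_terms(candidates, query):
    """Stand-in filter: keep each candidate once, and only if the query
    does not already contain it."""
    seen = set(tokenize(query))
    kept = []
    for term in candidates:
        if term not in seen:
            seen.add(term)
            kept.append(term)
    return kept


def reformulate(query, corpus, k=2, top_n=5):
    """Append filtered keywords from the initially retrieved documents
    to the original query."""
    docs = retrieve(query, corpus, k)
    keywords = filter_new_terms(extract_keywords(docs, top_n), query)
    return " ".join([query] + keywords)


# Usage example with a made-up three-document corpus.
corpus = [
    "Marie Curie won the Nobel Prize in Physics in 1903.",
    "Marie Curie discovered polonium and radium.",
    "The Eiffel Tower is in Paris.",
]
query = "What did Marie Curie discover?"
print(reformulate(query, corpus))
```

The sketch keeps the original query intact and only appends terms, so the retriever still sees the user's wording. A realistic system would replace each stand-in with a stronger component, for example a dense retriever and an embedding-based filter.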