Generative Query Reformulation Using Ensemble Prompting, Document Fusion, and Relevance Feedback

Kaustubh D. Dhole, Ramraj Chandradevan, Eugene Agichtein
2024-05-28
Abstract: Query Reformulation (QR) is a set of techniques used to transform a user's original search query to a text that better aligns with the user's intent and improves their search experience. Recently, zero-shot QR has been a promising approach due to its ability to exploit knowledge inherent in large language models. Inspired by the success of ensemble prompting strategies which have benefited other tasks, we investigate if they can improve query reformulation. In this context, we propose two ensemble-based prompting techniques, GenQREnsemble and GenQRFusion which leverage paraphrases of a zero-shot instruction to generate multiple sets of keywords to improve retrieval performance ultimately. We further introduce their post-retrieval variants to incorporate relevance feedback from a variety of sources, including an oracle simulating a human user and a "critic" LLM. We demonstrate that an ensemble of query reformulations can improve retrieval effectiveness by up to 18% on nDCG@10 in pre-retrieval settings and 9% on post-retrieval settings on multiple benchmarks, outperforming all previously reported SOTA results. We perform subsequent analyses to investigate the effects of feedback documents, incorporate domain-specific instructions, filter reformulations, and generate fluent reformulations that might be more beneficial to human searchers. Together, the techniques and the results presented in this paper establish a new state of the art in automated query reformulation for retrieval and suggest promising directions for future research.
Information Retrieval, Computation and Language
What problem does this paper attempt to address?
This paper addresses vocabulary mismatch in search: a user's initial query may not accurately express their information need, so the retrieved results are unsatisfactory. Specifically, the paper studies how Query Reformulation (QR) can improve the search experience. QR transforms the user's original query into text that better matches the user's intent, which improves the relevance and quality of the search results.

### Background and Motivation

1. **Limitations of User Queries**:
   - A search may return unsatisfactory results because the user lacks the vocabulary of a specific field, or because the query is vague.
   - By expanding the query terms or rewriting the query, QR can capture the user's actual need more closely.
2. **Potential of Zero-Shot Query Reformulation**:
   - Large language models (LLMs) perform well in zero-shot settings and can generate new queries or keywords from the knowledge stored in the model.
   - Zero-shot QR produces effective reformulations from a simple prompt, without labeled data, which makes it flexible and practical.

### Main Contributions of the Paper

1. **Proposing Two Query Reformulation Methods Based on Ensemble Prompts**:
   - **GenQREnsemble**: Generates multiple sets of keywords from multiple prompts (paraphrases of a single zero-shot instruction), then appends the merged keywords to the original query to improve retrieval performance.
   - **GenQRFusion**: Generates multiple sets of keywords in the same way, but combines each set with the original query separately, producing multiple reformulated queries. The results retrieved for these queries are then fused into the final ranking.
2. **Introducing Feedback Mechanisms**:
   - **GenQREnsemble-RF** and **GenQRFusion-RF**: Post-retrieval variants that refine the reformulation using relevance feedback from different sources, such as an oracle simulating a human user and a "critic" LLM.
3. **Experimental Verification**:
   - Experiments on multiple standard information retrieval benchmarks show that the proposed ensemble methods outperform previously reported state-of-the-art methods in both pre-retrieval and post-retrieval settings, with an improvement of up to 18% on nDCG@10.

### Experimental Results and Analysis

- **Pre-retrieval Performance**:
  - Compared with the original query, GenQREnsemble and GenQRFusion perform better on all four benchmarks. The gain is largest on the TP19 benchmark, where nDCG@10 and MAP increase by 18% and 24% respectively.
  - Performance improves gradually as the number of instructions increases, which supports the effectiveness of the ensemble approach.
- **Post-retrieval Performance**:
  - GenQREnsemble-RF improves retrieval performance further once relevance feedback is incorporated. On the TP19 benchmark, its nDCG@10 is 9% higher than that of other Pseudo-Relevance Feedback (PRF) methods.
- **Impact of Domain-Specific Instructions**:
  - Domain-specific instructions can improve retrieval further, particularly in specialized tasks such as DBPedia entity retrieval.

### Conclusion

The paper addresses the vocabulary mismatch problem in user queries with query reformulation methods based on ensemble prompts, and it reports substantial improvements in retrieval performance. The methods perform well in the pre-retrieval stage and improve further when relevance feedback is incorporated, which suggests a direction for future query reformulation research.
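
As a rough illustration of the two pre-retrieval methods summarized above, the following Python sketch shows how keyword sets generated from paraphrased instructions could be merged into one query (GenQREnsemble-style) or kept as separate queries whose rankings are fused (GenQRFusion-style). It is a minimal sketch under stated assumptions, not the authors' implementation: the `llm` callable, the whitespace tokenization, the function names, and the choice of reciprocal rank fusion with `k=60` are all assumptions made here, since the summary does not say which fusion function the paper uses.

```python
from collections import defaultdict
from typing import Callable, List, Sequence


def generate_keyword_sets(
    query: str,
    llm: Callable[[str], str],
    instructions: Sequence[str],
) -> List[List[str]]:
    """Prompt the LLM once per instruction paraphrase; each call yields one keyword set.

    `llm` is a hypothetical callable mapping a prompt string to a
    whitespace-separated keyword string.
    """
    return [llm(f"{instruction}: {query}").split() for instruction in instructions]


def ensemble_query(query: str, keyword_sets: Sequence[Sequence[str]]) -> str:
    """GenQREnsemble-style: append the de-duplicated union of all keywords to the query."""
    seen = set(query.lower().split())
    merged: List[str] = []
    for keywords in keyword_sets:
        for keyword in keywords:
            key = keyword.lower()
            if key not in seen:
                seen.add(key)
                merged.append(keyword)
    return " ".join([query] + merged)


def fusion_queries(query: str, keyword_sets: Sequence[Sequence[str]]) -> List[str]:
    """GenQRFusion-style: build one reformulated query per keyword set."""
    return [" ".join([query] + list(keywords)) for keywords in keyword_sets]


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids (assumed fusion function, not confirmed by the source).

    Each document scores the sum of 1 / (k + rank) over the lists it appears in;
    ties are broken by document id so the output is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

In use, `ensemble_query` produces a single expanded query for one retrieval call, whereas `fusion_queries` produces one query per instruction; each of those would be sent to a retriever of the reader's choice and the resulting ranked lists passed to `reciprocal_rank_fusion`.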