Understanding the User: An Intent-Based Ranking Dataset

Abhijit Anand, Jurek Leonhardt, V Venktesh, Avishek Anand
2024-08-30
Abstract: As information retrieval systems continue to evolve, accurate evaluation and benchmarking of these systems become pivotal. Web search datasets, such as MS MARCO, primarily provide short keyword queries without accompanying intent or descriptions, posing a challenge in comprehending the underlying information need. This paper proposes an approach to augmenting such datasets to annotate informative query descriptions, with a focus on two prominent benchmark datasets: TREC-DL-21 and TREC-DL-22. Our methodology involves utilizing state-of-the-art LLMs to analyze and comprehend the implicit intent within individual queries from benchmark datasets. By extracting key semantic elements, we construct detailed and contextually rich descriptions for these queries. To validate the generated query descriptions, we employ crowdsourcing as a reliable means of obtaining diverse human perspectives on the accuracy and informativeness of the descriptions. This information can be used as an evaluation set for tasks such as ranking, query rewriting, or others.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper aims to address the gap between user intent and machine understanding in information retrieval systems. Specifically:

1. **Aligning User Intent with Machine Intent**:
   - Current information retrieval systems often struggle to understand what users actually need when processing their queries. The mismatch may stem from the complexity and variability in how users express their queries and in how systems interpret and handle them.
2. **Limitations of Existing Datasets**:
   - The datasets currently used to train ranking models (such as MS MARCO) primarily provide short keyword queries without accompanying query intent or descriptions, making it difficult to understand the underlying information needs.

To address these issues, the paper proposes a method to enhance existing datasets by annotating detailed query descriptions that better reflect user intent. The authors use large language models (LLMs) to analyze the implicit intent within queries in benchmark datasets and extract key semantic elements to construct detailed, context-rich descriptions. To verify the accuracy of the generated query descriptions, the authors use crowdsourcing to gather diverse human feedback. The resulting annotations can serve as an evaluation set for tasks such as ranking and query rewriting.

The main contribution of the paper is a new dataset named DL-MIA, which contains 2655 (query, intent, passage, label) tuples derived from the TREC-DL 2021 and 2022 test sets. With this dataset, researchers can measure the gap between user intent and queries at a finer granularity and apply it to various ranking scenarios, such as re-ranking, diversification, intent coverage, or query suggestion.
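
To make the (query, intent, passage, label) tuple structure concrete, the following is a minimal Python sketch of how such tuples might be represented and used for a simple intent-coverage check. The field names, the `IntentJudgment` class, the `intent_coverage` function, the assumption that a label of 1 or higher means "relevant", and the toy rows are all hypothetical placeholders for illustration; they are not taken from the paper or from the released dataset files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentJudgment:
    """One (query, intent, passage, label) tuple; field names are assumed."""
    query: str
    intent: str
    passage_id: str
    label: int  # assumed: higher means more relevant to this intent


def intent_coverage(ranking, rows, query, min_label=1):
    """Fraction of a query's intents that have at least one passage
    with label >= min_label among the ranked passage ids."""
    intents = {r.intent for r in rows if r.query == query}
    if not intents:
        return 0.0
    ranked = set(ranking)
    covered = {
        r.intent
        for r in rows
        if r.query == query and r.label >= min_label and r.passage_id in ranked
    }
    return len(covered) / len(intents)


# Toy placeholder rows, not real DL-MIA data.
rows = [
    IntentJudgment("example query", "intent A", "p1", 1),
    IntentJudgment("example query", "intent A", "p2", 0),
    IntentJudgment("example query", "intent B", "p3", 1),
]

# Only intent A is covered by a ranking that returns p1 and p2.
print(intent_coverage(["p1", "p2"], rows, "example query"))
```

A ranking that also retrieves `p3` would cover both intents under these assumptions, which is the kind of difference an intent-aware evaluation set is meant to surface.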