Passage-aware Search Result Diversification

Zhan Su, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
DOI: https://doi.org/10.1145/3653672
IF: 4.657
2024-03-21
ACM Transactions on Information Systems
Abstract: Research on search result diversification aims to increase the variety of subtopics covered by a list of search results. Existing studies usually treat a document as a whole and represent it with one fixed-length vector. However, a long document may cover several different aspects of a query, so a single vector is usually insufficient to represent it. To tackle this problem, we propose to exploit multiple passages to better represent documents in search result diversification. Different passages of a document may reflect different subtopics of the query, and comparison among passages can improve result diversity. Specifically, we segment each document into multiple passages and train a classifier to filter out the irrelevant ones. Document diversity is then measured based on the remaining passages, which can satisfy the information needs of the query. We then devise a passage-aware search result diversification framework that takes into account the topic information contained in both the selected document sequence and the candidate documents. The novelty of each candidate document is evaluated based on its passages while considering the dynamically selected document sequence. We conducted experiments on a commonly used dataset, and the results indicate that our proposed method outperforms leading existing methods.
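The pipeline outlined in the abstract (segment, filter, then score candidate novelty against the already selected documents) can be illustrated with a toy sketch. This is not the authors' model: the paper trains a classifier and a diversification framework, whereas the sketch below assumes a query-term overlap score as a stand-in for the classifier, bag-of-words vectors as passage representations, and a greedy max-novelty selection rule. All function names and parameters (`split_passages`, `relevance`, `novelty`, `diversify`, `size`, `threshold`) are hypothetical.

```python
import math
from collections import Counter


def split_passages(text, size=8):
    """Cut a document into consecutive passages of at most `size` words."""
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def relevance(passage, query_terms):
    """Assumed stand-in for the trained classifier: share of query terms present."""
    return len(query_terms & set(passage)) / len(query_terms)


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def represent(text, query_terms, threshold=0.5):
    """Represent a document by one vector per passage that passes the filter."""
    kept = [p for p in split_passages(text) if relevance(p, query_terms) >= threshold]
    return [Counter(p) for p in kept]


def novelty(candidate_passages, selected_passages):
    """Novelty of a candidate: its most novel passage versus all selected passages."""
    if not selected_passages:
        return 1.0
    return max(
        1.0 - max(cosine(p, s) for s in selected_passages)
        for p in candidate_passages
    )


def diversify(docs, query, k=2):
    """Greedily pick up to k documents, re-scoring novelty after each pick."""
    query_terms = set(query.lower().split())
    reps = {doc_id: represent(text, query_terms) for doc_id, text in docs.items()}
    # Documents with no relevant passage are dropped entirely.
    candidates = [doc_id for doc_id in docs if reps[doc_id]]
    ranking, selected_passages = [], []
    while candidates and len(ranking) < k:
        best = max(candidates, key=lambda d: novelty(reps[d], selected_passages))
        ranking.append(best)
        selected_passages.extend(reps[best])
        candidates.remove(best)
    return ranking
```

Because novelty is recomputed against the growing set of selected passages, a candidate that repeats an already covered subtopic drops in rank as the list is built, which mirrors the "dynamically selected document sequence" idea in a much simplified form.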
computer science, information systems