BERT-QE: Contextualized Query Expansion for Document Re-ranking

Zhi Zheng, Kai Hui, Ben He, Xianpei Han, Le Sun, Andrew Yates
DOI: https://doi.org/10.48550/arXiv.2009.07258
2020-11-04
Abstract: Query expansion aims to mitigate the mismatch between the language used in a query and in a document. However, query expansion methods can suffer from introducing non-relevant information when expanding the query. To bridge this gap, inspired by recent advances in applying contextualized models like BERT to the document retrieval task, this paper proposes a novel query expansion model that leverages the strength of the BERT model to select relevant document chunks for expansion. In evaluation on the standard TREC Robust04 and GOV2 test collections, the proposed BERT-QE model significantly outperforms BERT-Large models.
Information Retrieval, Computation and Language
What problem does this paper attempt to address?
This paper addresses the language mismatch between queries and documents in information retrieval. Queries and documents can differ in vocabulary, formality, and even format (for example, queries often consist of keywords, while Wikipedia articles are written in natural language). Query expansion methods have been proposed to narrow this gap and improve document ranking. However, these methods can introduce non-relevant information during expansion, which contaminates the expanded query. The paper therefore proposes a BERT-based query expansion model (BERT-QE) that uses BERT to select relevant document chunks for expansion, so that the relevant information in pseudo-relevance feedback is used more effectively and the expansion is of higher quality.
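The chunk-selection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the window size and stride, and `top_k` are all hypothetical, and `overlap_score` is a toy term-overlap scorer standing in for a BERT relevance model so that the example runs without any model download.

```python
from typing import Callable, List, Tuple


def split_into_chunks(text: str, size: int, stride: int) -> List[str]:
    """Slide a window of `size` words over the text, advancing `stride` words each step."""
    words = text.split()
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the window has reached the end of the text
    return chunks


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in for a BERT relevance score (assumption): share of query terms in the passage."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def select_expansion_chunks(
    query: str,
    feedback_docs: List[str],
    scorer: Callable[[str, str], float] = overlap_score,
    size: int = 5,      # illustrative value, not taken from the paper
    stride: int = 2,    # illustrative value, not taken from the paper
    top_k: int = 2,     # illustrative value, not taken from the paper
) -> List[Tuple[float, str]]:
    """Score every chunk of the pseudo-relevance feedback documents and keep the best ones."""
    chunks = [c for doc in feedback_docs for c in split_into_chunks(doc, size, stride)]
    scored = [(scorer(query, c), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat near the door",
        "query expansion adds related terms to the original query",
    ]
    for score, chunk in select_expansion_chunks("query expansion terms", docs):
        print(f"{score:.2f}  {chunk}")
```

Passing the scorer in as a parameter keeps the selection logic independent of the relevance model, so a BERT-based scorer could replace the toy one without changing the rest of the sketch.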