A Concept Language Model for Ad-hoc Retrieval

Bin Zou, Vasileios Lampos, Shangsong Liang, Zhaochun Ren, Emine Yilmaz, Ingemar J. Cox
DOI: https://doi.org/10.1145/3041021.3054209
2017-01-01
Abstract: We propose an extension to language models for information retrieval. Typically, language models estimate the probability of a document generating the query, where the query is treated as a set of independent search terms. We extend this approach by considering the concepts implied by both the query and the words in the document. The model combines the probability of the document generating the concept embodied by the query with the traditional language model probability of the document generating the query terms. We use a word embedding space to express concepts. The similarity between two vectors in this space is estimated using a weighted cosine distance. The weighting significantly enhances the discrimination between vectors. We evaluate our model on benchmark datasets (TREC 6–8) and empirically demonstrate that it outperforms state-of-the-art baselines.
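To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of a weighted cosine similarity between embedding vectors and of a score that mixes a concept-level probability with a term-level language model probability. The abstract does not give the exact weighting scheme or the combination rule, so the per-dimension weights `w`, the linear interpolation, and the parameter `lam` are illustrative assumptions, as are the function names `weighted_cosine` and `combined_score`.

```python
import math


def weighted_cosine(u, v, w):
    """Cosine similarity with non-negative per-dimension weights w.

    Assumption: the weighting is applied per embedding dimension;
    the abstract does not specify how the paper defines its weights.
    """
    num = sum(wi * ui * vi for ui, vi, wi in zip(u, v, w))
    norm_u = math.sqrt(sum(wi * ui * ui for ui, wi in zip(u, w)))
    norm_v = math.sqrt(sum(wi * vi * vi for vi, wi in zip(v, w)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # similarity is undefined for a zero vector; return 0 by convention
    return num / (norm_u * norm_v)


def combined_score(concept_score, lm_score, lam=0.5):
    """Mix a concept-level score with a term-level language model score.

    Assumption: a linear interpolation with weight lam in [0, 1];
    the paper may combine the two probabilities differently.
    """
    return lam * concept_score + (1.0 - lam) * lm_score
```

With all weights equal to 1, `weighted_cosine` reduces to the ordinary cosine similarity. Raising the weight of a dimension on which two vectors agree pulls their similarity up, which is one way a weighting can change how sharply vectors are separated.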