Query Expansion Using A Collection Dependent Probabilistic Latent Semantic Thesaurus

Laurence A. F. Park, Kotagiri Ramamohanarao
DOI: https://doi.org/10.1007/978-3-540-71701-0_24
2007-01-01
Abstract: Many queries on collections of text documents are too short to produce informative results. Automatic query expansion is a method of adding terms to the query, without interaction from the user, in order to obtain more refined results. In this investigation, we examine our novel automatic query expansion method using the probabilistic latent semantic thesaurus, which is based on probabilistic latent semantic analysis. We show how to construct the thesaurus by mining text documents for probabilistic term relationships, and we show that by using the latent semantic thesaurus, we can overcome many of the previously identified problems associated with latent semantic analysis on large document sets. Experiments using TREC document sets show that our term expansion method outperforms the popular probabilistic pseudo-relevance feedback method by 7.3%.
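As a rough illustration of the idea summarized in the abstract, the sketch below builds a term-to-term probability table from topic-model parameters and uses it to add related terms to a short query. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the formula, so the relation P(x | y) = sum over z of P(x | z) P(z | y) is assumed here, and the names `build_thesaurus`, `expand_query`, the parameter `k`, and the toy vocabulary are all hypothetical.

```python
import numpy as np

def build_thesaurus(p_t_given_z, p_z):
    """Return T where T[x, y] approximates P(term x | term y).

    Assumption (not taken from the abstract): terms are conditionally
    independent given a latent topic z, so
    P(x | y) = sum_z P(x | z) * P(z | y).
    """
    # Joint P(t, z) = P(t | z) * P(z); shape (n_terms, n_topics).
    joint = p_t_given_z * p_z
    # Bayes' rule: P(z | t) = P(t, z) / P(t).
    p_z_given_t = joint / joint.sum(axis=1, keepdims=True)
    # Marginalize over topics to get term-to-term probabilities.
    return p_t_given_z @ p_z_given_t.T

def expand_query(query_ids, thesaurus, k=2):
    """Append the k terms most strongly related to the query terms."""
    # Sum each candidate's probability given every query term.
    scores = thesaurus[:, query_ids].sum(axis=1)
    # Never re-add a term that is already in the query.
    scores[query_ids] = -np.inf
    top = np.argsort(scores)[::-1][:k]
    return list(query_ids) + [int(i) for i in top]

# Toy data (hypothetical): five terms, two latent topics.
vocab = ["car", "engine", "wheel", "apple", "fruit"]
p_t_given_z = np.array([
    [0.4, 0.0],
    [0.3, 0.0],
    [0.3, 0.0],
    [0.0, 0.5],
    [0.0, 0.5],
])  # each column is P(t | z) and sums to 1
p_z = np.array([0.5, 0.5])

thesaurus = build_thesaurus(p_t_given_z, p_z)
expanded = expand_query([vocab.index("car")], thesaurus, k=2)
print([vocab[i] for i in expanded])
```

In this toy setup the one-term query "car" gains the two other terms that share its topic. A real system would estimate `p_t_given_z` and `p_z` from the document collection, which is what makes the thesaurus collection dependent.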