Concept Based Query and Document Expansion Using Hidden Markov Model

Jiuling Zhang, Zuoda Liu, Beixing Deng, Xing Li
DOI: https://doi.org/10.5220/0001842506880691
2009-01-01
Abstract: Query and document expansion techniques have been widely studied as a way to improve the effectiveness of information retrieval. In this paper, we propose a method for concept-based query and document expansion that employs the hidden Markov model (HMM). WordNet is adopted as the thesaurus providing the set of concepts and terms. The HMM recovers the underlying concepts from the original query or document term sequence, and expanded query and document candidates are generated from those concepts. Using 50,000 web pages crawled from university sites as the test collection and the Lemur Toolkit as the retrieval tool, a preliminary experiment on query expansion shows that the scores of the top 20 retrieved documents increase by 2.7113 on average. The number of documents scoring above a given threshold also increases significantly.
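The concept-recovery step described in the abstract can be illustrated with a minimal sketch, assuming concepts are the hidden states, query terms are the observations, and Viterbi decoding is used to find the most likely concept sequence. The abstract does not give the model's parameters or decoding procedure, so everything here is a hypothetical stand-in: the toy `CONCEPT_TERMS` thesaurus (in place of WordNet), the `RELATED` pairs, and the emission and transition probabilities are all invented for illustration.

```python
import math

# Hypothetical toy thesaurus: concept -> terms (a stand-in for WordNet).
CONCEPT_TERMS = {
    "finance_bank": ["bank", "deposit", "loan"],
    "river_bank": ["bank", "shore", "river"],
    "money": ["money", "cash", "deposit"],
    "water": ["water", "river", "stream"],
}

# Assumed pairs of related concepts; related concepts tend to follow each other.
RELATED = {
    ("finance_bank", "money"), ("money", "finance_bank"),
    ("river_bank", "water"), ("water", "river_bank"),
}


def emission(concept, term):
    """P(term | concept): uniform over the concept's terms, tiny otherwise."""
    terms = CONCEPT_TERMS[concept]
    return 1.0 / len(terms) if term in terms else 1e-6


def transition(prev, cur):
    """P(cur | prev): assumed values that sum to 1 over the four concepts."""
    if prev == cur:
        return 0.3
    return 0.5 if (prev, cur) in RELATED else 0.1


def viterbi(terms):
    """Return the most likely concept sequence for a term sequence."""
    if not terms:
        return []
    concepts = list(CONCEPT_TERMS)
    start = math.log(1.0 / len(concepts))  # uniform initial distribution
    score = {c: start + math.log(emission(c, terms[0])) for c in concepts}
    back = []  # one back-pointer table per term after the first
    for term in terms[1:]:
        new_score, pointers = {}, {}
        for cur in concepts:
            best_prev = max(
                concepts,
                key=lambda p: score[p] + math.log(transition(p, cur)),
            )
            new_score[cur] = (
                score[best_prev]
                + math.log(transition(best_prev, cur))
                + math.log(emission(cur, term))
            )
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the highest-scoring final concept.
    path = [max(concepts, key=lambda c: score[c])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def expand(terms):
    """Append the terms of each recovered concept to the original terms."""
    expanded = list(terms)
    for concept in viterbi(terms):
        for term in CONCEPT_TERMS[concept]:
            if term not in expanded:
                expanded.append(term)
    return expanded
```

With these assumed probabilities, the ambiguous term "bank" resolves to `river_bank` when followed by "river" and to `finance_bank` when followed by "money", so the expansion terms come from the intended sense only.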