An algorithm of document refinement based on sentence similarity computation

Ting Ma, Daling Wang, Ge Yu, Baoshun Hu, DongLing Chen
2007-01-01
Journal of Computational Information Systems
Abstract: Because the same concept can be expressed in multiple ways, word mismatch between user queries and documents is one of the main factors that degrade retrieval performance. Query expansion is a common way to address this problem, but it does not work well for general queries. Document refinement is more effective for this type of query: it reconstructs documents based on query semantics instead of adding new words to queries. In this paper, we adopt this idea and propose a sentence-based similarity computation method for document refinement. In contrast to the traditional TF-IDF model, our method also considers additional structural patterns of query words within sentences. Experiments indicate that combining document refinement with sentence-based similarity computation can significantly improve retrieval precision and the ranking of results.