Chinese Document Re-Ranking Based on Term Distribution and Maximal Marginal Relevance

Lingpeng Yang, Donghong Ji, Munkew Leong
DOI: https://doi.org/10.1007/11562382_23
2005-01-01
Abstract: In this paper, we propose a document re-ranking method for Chinese information retrieval in which the query is a short natural-language description. The method is based on term distribution: each term is weighted by its local and global distribution, including document frequency, document position, and term length. The weighting scheme removes the concern that very few relevant documents appear among the top retrieved documents, and allows a larger portion of the retrieved documents to be chosen arbitrarily as relevance feedback. It also helps improve the performance of the maximal marginal relevance (MMR) model in document re-ranking. Experiments show that our method achieves significant improvement over standard baselines and consistently outperforms related methods.
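For readers unfamiliar with MMR, the following is a minimal sketch of generic MMR re-ranking, not the authors' implementation. It assumes documents and the query are already represented as sparse term-weight dictionaries and uses plain cosine similarity; the paper's own term weights (document frequency, document position, term length) are not specified in the abstract and are not reproduced here. The names `cosine`, `mmr_rerank`, and the trade-off parameter `lam` are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_rerank(query_vec, doc_vecs, lam=0.7):
    """Return document indices in MMR order.

    At each step, pick the unselected document d that maximizes
        lam * sim(d, query) - (1 - lam) * max(sim(d, s) for s already selected)
    so that lam = 1.0 ranks by relevance only, and smaller values
    penalize redundancy with documents already chosen.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    remaining = list(range(len(doc_vecs)))
    selected = []
    while remaining:
        def score(i):
            # Redundancy is 0 while nothing has been selected yet.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): documents 0 and 1 are duplicates.
query = {"a": 1.0, "b": 1.0}
docs = [
    {"a": 1.0, "b": 1.0},
    {"a": 1.0, "b": 1.0},
    {"a": 1.0, "c": 1.0},
]
print(mmr_rerank(query, docs, lam=1.0))  # [0, 1, 2]
print(mmr_rerank(query, docs, lam=0.3))  # [0, 2, 1]
```

With `lam=1.0` the ordering follows relevance alone, so the duplicate stays in second place; with `lam=0.3` the redundancy penalty pushes the duplicate below the less relevant but novel document.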