Document Relevance Calculation Based on Lexical Cohesion

赵玉茗,徐志明,王晓龙,朱鲲鹏
DOI: https://doi.org/10.3724/sp.j.1146.2007.00493
2011-01-01
Abstract: This paper explores the feasibility of constructing a document relevance model based on lexical cohesion combined with structure analysis. Following the principle of lexical cohesion, the model extracts clusters of semantically related words from each document and formalizes the document as an expression composed of lexical chains that carry structure information. Under this representation, calculating the relevance of two documents reduces to calculating the semantic distance between their lexical chains. The feasibility of the approach was examined in experiments on a Chinese Library Classification (CLC) dataset. The results show that the method makes good use of the background knowledge of ordinary users and is effective for calculating document relevance.
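The idea of representing documents as lexical chains and comparing the chains can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy thesaurus `TOY_THESAURUS`, the helper names `build_chains` and `chain_relevance`, and the scoring itself. The abstract does not specify the paper's semantic distance, so cosine similarity over chain lengths is used here as a stand-in, and the structure information mentioned in the abstract is not modeled.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy thesaurus mapping a word to a concept label.
# This is an illustrative assumption, not the resource used in the paper.
TOY_THESAURUS = {
    "bank": "finance", "loan": "finance", "interest": "finance",
    "river": "water", "stream": "water", "flood": "water",
}


def build_chains(tokens):
    """Group a document's tokens into chains keyed by concept label.

    Tokens missing from the thesaurus are ignored.
    """
    chains = defaultdict(list)
    for tok in tokens:
        concept = TOY_THESAURUS.get(tok)
        if concept is not None:
            chains[concept].append(tok)
    return dict(chains)


def chain_relevance(chains_a, chains_b):
    """Cosine similarity over chain lengths.

    A stand-in for the paper's semantic distance between lexical chains,
    which the abstract does not define.
    """
    shared = chains_a.keys() & chains_b.keys()
    dot = sum(len(chains_a[c]) * len(chains_b[c]) for c in shared)
    norm_a = sqrt(sum(len(v) ** 2 for v in chains_a.values()))
    norm_b = sqrt(sum(len(v) ** 2 for v in chains_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a document with no chains is related to nothing
    return dot / (norm_a * norm_b)


doc_a = build_chains("the bank raised the loan interest".split())
doc_b = build_chains("loan interest at the bank".split())
doc_c = build_chains("the river caused a flood".split())

print(chain_relevance(doc_a, doc_b))
print(chain_relevance(doc_a, doc_c))
```

In this sketch, two documents whose words fall into the same concept chains score high even when their word order differs, while documents with no shared chain score zero. A fuller treatment would replace the toy thesaurus with a real lexical resource and weight chains by their position in the document structure.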