A Novel Linguistic Phenomenon Description for Text Similarity Computing

Dequan Zheng, Tiejun Zhao, Sheng Li, Muyun Yang
DOI: https://doi.org/10.1049/cp:20070753
2007-01-01
Abstract: This paper presents a method for computing text similarity based on a novel description of linguistic phenomena. First, a word sense ontology is constructed for each keyword from multiple kinds of context information. Next, the features shared by a pair of texts are extracted, and the usage of each context co-occurrence feature is characterized by part of speech, semantics, location, and average co-occurrence probability, and expressed as linguistic ontology knowledge. Finally, a text similarity evaluation value is calculated for each text to judge the degree of similarity. The method was evaluated on the Chinese document set from the NTCIR-3 workshop collection; the results show average increases in precision of 15.45%-18.49% at the top 10 ranked documents and 11.96%-15.35% at the top 100.
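
The evaluation above is reported as precision at the top 10 and top 100 ranked documents. As an illustrative sketch only (not the authors' evaluation code), precision at a cutoff k can be computed as follows; the function name, document IDs, and relevance judgments are hypothetical and chosen for the example.

```python
from typing import Sequence, Set


def precision_at_k(ranked_doc_ids: Sequence[str],
                   relevant_doc_ids: Set[str],
                   k: int) -> float:
    """Return the fraction of the top-k ranked documents judged relevant.

    The denominator is k itself, so a ranking that returns fewer than
    k documents is penalized for the missing positions.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_doc_ids)
    return hits / k


# Hypothetical ranking and relevance judgments, for illustration only.
ranking = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}
p_at_2 = precision_at_k(ranking, relevant, 2)
p_at_4 = precision_at_k(ranking, relevant, 4)
```

In a retrieval experiment of this kind, the value would typically be computed per query at k = 10 and k = 100 and then averaged over all queries before comparing systems.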