Computing Lexical Semantic Relatedness with Chinese Wikipedia

Fuqiang WAN, Yunfang WU
DOI: https://doi.org/10.3969/j.issn.1003-0077.2013.06.005
2013-01-01
Affiliation: Key Laboratory of Computational Linguistics (Peking University), Ministry of Education, Beijing, 100871
Abstract: Lexical semantic relatedness plays an important role in natural language processing tasks such as information retrieval, word sense disambiguation, automatic text summarization, and spelling correction. In this paper, we employ Wikipedia-based Explicit Semantic Analysis to compute the semantic relatedness between Chinese words. Based on Chinese Wikipedia, each word is represented as a weighted vector of concepts, so computing the semantic relatedness of two words amounts to comparing their concept vectors. Furthermore, we add a prior probability factor for each concept and use the link information among Wikipedia pages to optimize the concept vectors. The experimental results show that the Spearman’s rank correlation coefficient between the computed relatedness and human judgments reaches 0.52, which significantly outperforms the baseline.
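The core idea of the abstract, representing a word as a weighted vector over Wikipedia concepts and comparing two such vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy articles, the TF-IDF weighting, the cosine comparison, and the names `ARTICLES`, `build_concept_vectors`, and `relatedness` are all assumptions made for illustration. The sketch uses whitespace-tokenized English text rather than segmented Chinese and omits the prior probability factor and the link-based optimization that the paper describes.

```python
import math
from collections import Counter

# Hypothetical toy stand-in for Wikipedia: concept title -> article text.
# Real Chinese text would first need word segmentation (not shown here).
ARTICLES = {
    "Computer": "computer software hardware program keyboard",
    "Programming": "program software code computer language",
    "Music": "piano violin song melody concert",
}


def build_concept_vectors(articles):
    """Map each word to a sparse vector {concept: TF-IDF weight}."""
    n_concepts = len(articles)
    # Term frequency of every word inside every concept article.
    tf = {concept: Counter(text.split()) for concept, text in articles.items()}
    # Document frequency: number of concept articles containing each word.
    df = Counter(word for counts in tf.values() for word in counts)

    vectors = {}
    for concept, counts in tf.items():
        for word, count in counts.items():
            idf = math.log(n_concepts / df[word])
            vectors.setdefault(word, {})[concept] = count * idf
    return vectors


def relatedness(vectors, word1, word2):
    """Cosine similarity between the concept vectors of two words."""
    v1 = vectors.get(word1, {})
    v2 = vectors.get(word2, {})
    dot = sum(weight * v2.get(concept, 0.0) for concept, weight in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # unknown word, or a word with an all-zero vector
    return dot / (norm1 * norm2)


VECTORS = build_concept_vectors(ARTICLES)
```

With this toy data, `relatedness(VECTORS, "computer", "software")` is high because both words occur in the same two concept articles, while `relatedness(VECTORS, "computer", "piano")` is zero because the two words share no concept. An evaluation like the one reported in the abstract would then compare such scores against human relatedness judgments using Spearman's rank correlation.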