Chinese Word Similarity Computing Based on Combination Strategy

Shaoru Guo, Yong Guan, Ru Li, Qi Zhang
DOI: https://doi.org/10.1007/978-3-319-50496-4_67
2016-01-01
Abstract: Chinese word similarity computing is a fundamental task in natural language processing. This paper presents a combination strategy for calculating the similarity between Chinese words. We train a Word2Vector model on Baidubaike and then integrate three methods to calculate the semantic similarity between Chinese words: a semantic dictionary-based method, a Word2Vector-based method, and a Chinese FrameNet (CFN)-based method. The semantic dictionary-based method draws on dictionaries such as HowNet, DaCilin, Tongyici Cilin (Extended), and Antonym. Experiments on 500 word pairs yield a Spearman correlation coefficient of 0.524 on the test data, which shows that the proposed method is feasible and effective.
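
The abstract does not state how the three similarity scores are combined, so the following is only a minimal sketch under stated assumptions. It assumes a weighted average over whichever methods return a score, then evaluates the combined scores against human ratings with the Spearman correlation coefficient, the metric reported above. The function names, the method keys ("dictionary", "word2vec", "cfn"), the weights, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence


def combine_scores(scores: Dict[str, Optional[float]],
                   weights: Dict[str, float]) -> float:
    """Weighted average over the methods that produced a score.

    Assumption: a method returns None when it cannot score a pair
    (for example, a word missing from a dictionary). The abstract does
    not specify the combination rule; this is one plausible choice.
    """
    total = 0.0
    weight_sum = 0.0
    for name, score in scores.items():
        if score is None:
            continue  # skip methods with no coverage for this pair
        w = weights.get(name, 0.0)
        total += w * score
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical weights and per-pair method scores (made-up toy data).
WEIGHTS = {"dictionary": 0.5, "word2vec": 0.3, "cfn": 0.2}
pair_scores = [
    {"dictionary": 0.8, "word2vec": 0.6, "cfn": None},
    {"dictionary": 0.2, "word2vec": 0.3, "cfn": 0.1},
    {"dictionary": None, "word2vec": 0.9, "cfn": 0.7},
]
human_ratings = [7.5, 2.0, 9.0]  # made-up gold similarity ratings

predicted = [combine_scores(s, WEIGHTS) for s in pair_scores]
rho = spearman(predicted, human_ratings)
```

Because Spearman correlation depends only on rank order, the combined scores do not need to be on the same scale as the human ratings; only the ordering of the word pairs matters.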