Word Similarity Computation with Extreme-Similar Method.

Peiwen Du, Siding Chen, Xiaofei Xu, Li
DOI: https://doi.org/10.1007/978-3-319-69781-9_6
2017-01-01
Abstract: Chinese word similarity calculation is a key technique in Chinese information processing. The most widely used word-based similarity calculations often fail to detect subtle differences between two words, which can lead to gross misestimation of their similarity. In this paper, we propose a new method to calculate the similarity between two Chinese words, with a particular focus on comparing pairs of words that are very similar in meaning. A hybrid combination strategy is formulated that incorporates other similarity calculations for scenarios between the two extreme conditions. Different corpora and models are used to train the proposed method; its output is then combined with the score obtained from HowNet, and the final similarity value is refined accordingly. This model makes an important improvement over existing strategies. Experiments on very similar words were conducted with two evaluation metrics, the Spearman rank correlation coefficient and the Pearson correlation coefficient. Our final results are 0.427/0.421, which outperform the existing state-of-the-art models and demonstrate the effectiveness of the proposed method.
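The abstract evaluates predicted similarities against reference ratings using the Spearman and Pearson coefficients. The following is a minimal sketch of how those two metrics can be computed, not the authors' evaluation code. The function names (`pearson`, `average_ranks`, `spearman`) and the sample scores are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman coefficient: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Hypothetical data: human ratings and model scores for five word pairs.
    human = [9.1, 7.4, 3.2, 8.8, 1.5]
    model = [0.93, 0.71, 0.40, 0.80, 0.22]
    print("Spearman:", spearman(human, model))
    print("Pearson:", pearson(human, model))
```

Spearman depends only on the ordering of the scores, whereas Pearson is sensitive to their actual values, so reporting both shows whether a method ranks word pairs correctly and whether its scores scale linearly with the reference ratings.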