Capturing Semantic Similarity for Words in Wikipedia with Random Walk

Jianyong Duan, Jiayuan Cui, Mingli Wu, Hao Wang
DOI: https://doi.org/10.1109/ccis.2018.8691144
2018-01-01
Abstract: As the most comprehensive and best-structured encyclopedic knowledge base currently known, Wikipedia provides a wealth of semantic knowledge. To fully exploit the semantic similarity between keywords in Wikipedia, this paper proposes a word similarity method based on random walks. First, we construct a Wikipedia link graph from Wikipedia's structured information, and then use random walks to mine the semantic relatedness of Wikipedia keywords. We propose T-truncated and ε-truncated pruning strategies to improve the algorithm's performance. Experimental results show that the Spearman correlation with the human-annotated standard reaches 0.51, a significant improvement in computing word relatedness.
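The abstract does not spell out the walk or the pruning rules, so the following is only a minimal sketch of how a truncated random walk over a link graph might yield a word similarity score. It assumes that "T-truncated" means stopping the walk after T steps and that "ε-truncated" means discarding probability mass below a threshold ε; the toy graph, the function names (`truncated_walk`, `cosine`, `similarity`), and the use of cosine similarity over visit distributions are all illustrative assumptions, not the authors' method.

```python
import math

# Hypothetical toy link graph: page title -> list of outgoing links.
GRAPH = {
    "Cat": ["Mammal", "Pet"],
    "Dog": ["Mammal", "Pet"],
    "Car": ["Vehicle"],
    "Mammal": ["Animal"],
    "Pet": ["Animal"],
    "Vehicle": ["Machine"],
    "Animal": [],
    "Machine": [],
}

def truncated_walk(graph, start, T=3, eps=1e-3):
    """Accumulate visit probabilities of a walk from `start`.

    Assumed pruning rules (not taken from the paper):
    - T-truncation: the walk is stopped after T steps.
    - eps-truncation: nodes reached with probability < eps are dropped.
    """
    current = {start: 1.0}   # probability mass at the current step
    visited = {start: 1.0}   # mass accumulated over all steps
    for _ in range(T):
        nxt = {}
        for node, p in current.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                continue  # dead end: this mass stops here
            share = p / len(neighbors)  # uniform transition over out-links
            for nb in neighbors:
                nxt[nb] = nxt.get(nb, 0.0) + share
        # eps-truncation: keep only nodes with enough probability mass
        current = {n: p for n, p in nxt.items() if p >= eps}
        for n, p in current.items():
            visited[n] = visited.get(n, 0.0) + p
    return visited

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(p * v.get(n, 0.0) for n, p in u.items())
    norm_u = math.sqrt(sum(p * p for p in u.values()))
    norm_v = math.sqrt(sum(p * p for p in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity(graph, a, b, T=3, eps=1e-3):
    """Compare two words by the pages their truncated walks reach."""
    return cosine(truncated_walk(graph, a, T, eps),
                  truncated_walk(graph, b, T, eps))

if __name__ == "__main__":
    print(similarity(GRAPH, "Cat", "Dog"))
    print(similarity(GRAPH, "Cat", "Car"))
```

In this sketch, a smaller T or a larger ε shrinks the set of pages each walk touches, which is one plausible way such pruning could trade accuracy for speed on a graph the size of Wikipedia.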