Feature-based Approaches to Semantic Similarity Assessment of Concepts Using Wikipedia

Yuncheng Jiang, Xiaopei Zhang, Yong Tang, Ruihua Nie
DOI: https://doi.org/10.1016/j.ipm.2015.01.001
IF: 7.466
2015-01-01
Information Processing & Management
Abstract: Semantic similarity assessment between concepts is an important task in many language-related applications. In the past, several approaches have been proposed that assess similarity by evaluating the knowledge modeled in one or more ontologies. However, the existing measures have limitations, such as relying on predefined ontologies and fitting only non-dynamic domains. Wikipedia provides a very large, domain-independent encyclopedic repository and semantic network for computing the semantic similarity of concepts, with more coverage than typical ontologies. In this paper, we propose some novel feature-based similarity assessment methods that are fully dependent on Wikipedia and can avoid most of the limitations and drawbacks introduced above. To implement feature-based similarity assessment using Wikipedia, we first present a formal representation of Wikipedia concepts. We then give a framework for feature-based similarity built on this formal representation. Lastly, we investigate several feature-based approaches to semantic similarity measures resulting from instantiations of the framework. The evaluation, based on several widely used benchmarks and a benchmark developed by ourselves, supports the intuitions with respect to human judgements. Overall, several methods proposed in this paper correlate well with human judgements and constitute effective ways of determining similarity between Wikipedia concepts. (C) 2015 Elsevier Ltd. All rights reserved.
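To make the general idea of feature-based similarity concrete, the sketch below uses Tversky's classic ratio model over sets of features. This is an illustration only and is not the framework proposed in the paper, whose formal representation and measures are not given in the abstract. The function name, the weights alpha and beta, and the toy feature sets (imagined here as terms drawn from a concept's Wikipedia page) are all assumptions.

```python
def tversky_similarity(features_a, features_b, alpha=0.5, beta=0.5):
    """Tversky ratio model: shared features over shared plus weighted distinct ones.

    alpha and beta weight the features unique to each concept (assumed values).
    alpha = beta = 1 gives the Jaccard index; alpha = beta = 0.5 gives Dice.
    """
    a, b = set(features_a), set(features_b)
    common = len(a & b)   # features shared by both concepts
    only_a = len(a - b)   # features unique to the first concept
    only_b = len(b - a)   # features unique to the second concept
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom else 0.0


# Hypothetical feature sets; real ones might come from categories, links, or glosses.
car = {"vehicle", "wheel", "engine", "road"}
bicycle = {"vehicle", "wheel", "pedal", "road"}

print(tversky_similarity(car, bicycle))  # 0.75
```

With three shared features and one distinct feature on each side, the default weights give 3 / (3 + 0.5 + 0.5) = 0.75. How features are extracted from Wikipedia and how they are weighted is exactly what the paper's framework would specify, so those choices are left open here.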