Exploit Semantic Information For Category Annotation Recommendation In Wikipedia

Yang Wang, Haofen Wang, Haiping Zhu, Yong Yu
DOI: https://doi.org/10.1007/978-3-540-73351-5_5
2007-01-01
Abstract: Compared with plain-text resources, those in "semi-semantic" web sites such as Wikipedia contain high-level semantic information that can benefit various automatic annotation tasks on the sites themselves. In this paper, we propose a "collaborative annotating" approach that automatically recommends categories for a Wikipedia article by reusing category annotations from its most similar articles and ranking these annotations by their confidence. In this approach, four typical semantic features in Wikipedia, namely incoming links, outgoing links, section headings and template items, are investigated and exploited as the representation of articles to feed the similarity calculation. The experimental results show not only that these semantic features improve the performance of category annotation compared with the plain-text feature, but also that our approach is effective at discovering missing annotations and annotations at the proper level for Wikipedia articles.
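The recommendation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the confidence formula, so the Jaccard overlap, the similarity-weighted vote, the function names, the feature prefixes (`in:`, `out:`, `head:`, `tmpl:`) and the toy corpus below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def jaccard(a, b):
    """Set-overlap similarity (an assumed stand-in for the paper's measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_categories(target_features, corpus, k=2):
    """Rank categories for a target article.

    corpus maps an article title to (feature set, category set).
    Categories are collected from the k most similar articles, and each
    category's confidence is the summed similarity of the neighbours
    that carry it (an assumed scoring rule).
    """
    neighbours = sorted(
        ((jaccard(target_features, feats), title)
         for title, (feats, _) in corpus.items()),
        reverse=True,
    )[:k]

    confidence = defaultdict(float)
    for sim, title in neighbours:
        for category in corpus[title][1]:
            confidence[category] += sim

    # Highest confidence first; ties broken alphabetically.
    return sorted(confidence.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: features mix outgoing links, section headings
# and template items, mirroring the feature types named in the abstract.
corpus = {
    "Paris": ({"out:France", "out:Seine", "head:History", "tmpl:Infobox city"},
              {"Capitals in Europe", "Cities in France"}),
    "Lyon": ({"out:France", "out:Rhone", "head:History", "tmpl:Infobox city"},
             {"Cities in France"}),
    "Python": ({"out:Guido van Rossum", "head:Syntax", "tmpl:Infobox language"},
               {"Programming languages"}),
}
target = {"out:France", "out:Garonne", "head:History", "tmpl:Infobox city"}

ranked = recommend_categories(target, corpus, k=2)
for category, score in ranked:
    print(f"{category}: {score:.2f}")
```

In this toy run the two nearest neighbours both carry "Cities in France", so it outranks "Capitals in Europe", while the unrelated article contributes nothing. A real system would compute similarity separately per feature type or weight the features, which the abstract leaves open.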