Explore Chinese Encyclopedic Knowledge to Disambiguate Person Names.

Jie Liu, Ruifeng Xu, Qin Lu, Jian Xu
2012-01-01
Abstract: This paper presents the HITSZ-PolyU system for the CIPS-SIGHAN Bakeoff 2012 Task 3, Chinese Personal Name Disambiguation. The system leverages the Chinese encyclopedia Baidu Baike (Baike) as external knowledge to disambiguate person names. Three kinds of features are extracted from Baike: the entities’ texts, the entities’ work-of-art words, and titles. With these features, a Decision Tree (DT) based classifier is trained to link test names to nodes in the NameKB. In addition, the contextual information surrounding each test name is used to verify whether it is in fact a person name. Finally, a simple clustering approach groups the NIL test names, that is, those with no link to the NameKB. The proposed system attains 64.04% precision, 70.1% recall and 66.95% F-score.
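The abstract does not specify the "simple clustering approach" used for NIL names. As an illustration only, the sketch below assumes a greedy single-pass clustering over sets of context words with a Jaccard-overlap threshold. The function names, the threshold value, and the toy English word sets are hypothetical and are not taken from the paper; real input would be segmented Chinese text.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of context words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_nil_mentions(mentions, threshold=0.3):
    """Greedy single-pass clustering (an assumed stand-in, not the paper's method).

    mentions: list of (doc_id, set_of_context_words) for names that were
    not linked to any knowledge-base node.
    Each mention joins the first cluster whose pooled vocabulary is similar
    enough; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"docs": [...], "words": set(...)}
    for doc_id, words in mentions:
        for cluster in clusters:
            if jaccard(words, cluster["words"]) >= threshold:
                cluster["docs"].append(doc_id)
                cluster["words"] |= words  # pool the vocabulary
                break
        else:
            # No existing cluster was close enough: open a new one.
            clusters.append({"docs": [doc_id], "words": set(words)})
    return [c["docs"] for c in clusters]


# Toy usage: two mentions share context words, the third does not.
mentions = [
    ("d1", {"actor", "film", "director"}),
    ("d2", {"actor", "film", "award"}),
    ("d3", {"professor", "physics", "university"}),
]
groups = cluster_nil_mentions(mentions)
```

Under these assumptions, documents whose contexts overlap are grouped as referring to the same unlisted person, while unrelated contexts start separate groups.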