Exploiting Wikipedia Priori Knowledge for Chinese Named Entity Recognition

Jianfeng Li, Conghui Zhu, Sheng Li, Tiejun Zhao, Dequan Zheng
DOI: https://doi.org/10.1109/fskd.2016.7603406
2016-01-01
Abstract: Information Extraction is an important task in Natural Language Processing research. Named Entity Recognition (NER) is one of the basic tasks of Information Extraction, and its quality has a great impact on subsequent tasks such as Relation Extraction. A major difficulty of NER lies in identifying unknown words. To address this issue, we study methods that exploit Wikipedia as a source of external information. Wikipedia is an online encyclopedia that has grown rapidly in recent years; by 2016, the number of Chinese entries had reached 860,000. Used as external knowledge, Wikipedia provides a large amount of valuable information for identifying unknown words. Wikipedia entries are selected and incorporated as features into a Conditional Random Field (CRF) model for NER. Experiments demonstrate that this method significantly improves NER performance.
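The abstract does not say how the selected Wikipedia entries are encoded as CRF features. The following is a minimal sketch of one common approach, not the authors' implementation: each character is tagged B/I/O according to whether it falls inside a dictionary match, and that tag is added to the per-character feature set. The function names `gazetteer_tags` and `char_features`, the use of forward maximum matching, the B/I/O encoding, and the sample entries are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def gazetteer_tags(chars: List[str], entries: Set[str]) -> List[str]:
    """Tag each character B/I/O using forward maximum matching against
    a set of entry titles (hypothetical stand-in for Wikipedia entries)."""
    max_len = max((len(e) for e in entries), default=0)
    tags = ["O"] * len(chars)
    i = 0
    while i < len(chars):
        matched = 0
        # Try the longest candidate first; single-character entries are skipped.
        for n in range(min(max_len, len(chars) - i), 1, -1):
            if "".join(chars[i:i + n]) in entries:
                matched = n
                break
        if matched:
            tags[i] = "B"
            for j in range(i + 1, i + matched):
                tags[j] = "I"
            i += matched
        else:
            i += 1
    return tags


def char_features(chars: List[str], entries: Set[str]) -> List[Dict[str, str]]:
    """Build one feature dict per character, including the dictionary tag."""
    wiki = gazetteer_tags(chars, entries)
    features = []
    for i, ch in enumerate(chars):
        features.append({
            "char": ch,
            "prev_char": chars[i - 1] if i > 0 else "<BOS>",
            "next_char": chars[i + 1] if i < len(chars) - 1 else "<EOS>",
            "wiki": wiki[i],  # assumed encoding of the Wikipedia knowledge
        })
    return features


# Illustrative input only; the entry set is made up for this example.
sentence = list("我在哈尔滨工业大学学习")
wiki_entries = {"哈尔滨", "哈尔滨工业大学"}
features = char_features(sentence, wiki_entries)
```

Feature dicts of this shape could then be passed to a CRF toolkit that accepts per-token feature mappings. Longest-match-first is chosen here so that a full entity name is preferred over a shorter entry it contains.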