Improving Feature Extraction in Named Entity Recognition Based on Maximum Entropy Model

Wei Jiang, Yi Guan, Xiaolong Wang
DOI: https://doi.org/10.1109/ICMLC.2006.258916
2006-01-01
Abstract: This paper proposes a new method of improving feature extraction for named entity recognition. First, context features and entity features are extracted by their corresponding algorithms. Triggers extracted by mutual information, information gain, average mutual information, and similar measures are adopted to enhance the context features, and rough set theory is used to extract the entity features. Second, a word cluster method is presented to improve feature expansion, which makes feature selection easier and effectively overcomes the sparse data problem. Finally, all the features are added to the maximum entropy model. Experiments confirm that the method is effective. The method has been used in our word segmenter, which participated in the International SIGHAN-2005 Evaluation and ranked first in the open test on the MSR corpus.
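To make the trigger idea concrete, here is a minimal sketch of scoring context words against entity labels by pointwise mutual information. It is an illustration only, not the authors' implementation: the function name `pmi_triggers`, the toy samples, and the labels `PER` and `LOC` are all assumptions, and the abstract does not say which exact variant of mutual information the paper uses.

```python
import math
from collections import Counter


def pmi_triggers(samples, min_count=1):
    """Score (word, label) pairs by pointwise mutual information.

    samples: iterable of (context_words, label) pairs, where context_words
    is a list of tokens near a candidate entity and label is its class.
    Hypothetical helper for illustration; not taken from the paper.
    """
    word_counts = Counter()
    label_counts = Counter()
    pair_counts = Counter()
    total = 0
    for words, label in samples:
        total += 1
        label_counts[label] += 1
        for w in set(words):  # count each word at most once per sample
            word_counts[w] += 1
            pair_counts[(w, label)] += 1

    scores = {}
    for (w, label), n in pair_counts.items():
        if n < min_count:
            continue  # skip pairs seen too rarely to trust
        p_joint = n / total
        p_word = word_counts[w] / total
        p_label = label_counts[label] / total
        # PMI = log2( P(w, label) / (P(w) * P(label)) )
        scores[(w, label)] = math.log2(p_joint / (p_word * p_label))
    return scores


# Toy data (invented): context words around person and location mentions.
samples = [
    (["mr", "said"], "PER"),
    (["mr", "visited"], "PER"),
    (["in", "visited"], "LOC"),
    (["in", "city"], "LOC"),
]
scores = pmi_triggers(samples)

# Rank candidate triggers for the PER class, highest PMI first.
per_triggers = sorted(
    ((w, s) for (w, label), s in scores.items() if label == "PER"),
    key=lambda item: item[1],
    reverse=True,
)
```

In this toy data, "mr" co-occurs only with `PER` and scores higher than "visited", which appears equally with both labels. A word ranked high for a class would be a candidate trigger to add as a context feature.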