Automatic and Efficient Recognition of Proper Nouns Based on Maximum Entropy Model

L Peng, Y Guan, XL Wang, J Sun
DOI: https://doi.org/10.1109/icmlc.2005.1527597
2005-01-01
Abstract: This paper presents a high-performance method for identifying English proper nouns (PNs) based on a maximum entropy (MaxEnt) model. Most traditional PN recognition systems rely on lexical resources such as name lists, but because new names constantly come into existence, such resources are necessarily incomplete. Machine learning methods are therefore used to identify PNs automatically. In the MaxEnt framework, semantic and lexical information about the word itself and its surrounding words serves as atomic features; these atomic features are combined into feature templates, which generate the features without requiring extra expert knowledge. Tests on the WSJ portion of Penn Treebank II show that the method achieves high precision and recall while dramatically reducing the number of features, shrinking the system's space consumption, and shortening training and testing time, which improves efficiency considerably. Because its underlying principle is general, the method can easily be adapted to identify other specific classes of nouns.
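The idea of building feature templates from atomic features can be illustrated with a minimal sketch. The abstract does not list the paper's actual atomic features or templates, so everything named here (`word_at`, `is_capitalized`, `ATOMIC`, `TEMPLATES`, `extract_features`) is a hypothetical illustration, and only simple lexical cues are shown, not the semantic information the paper also uses. The resulting feature strings are the kind of input a MaxEnt classifier would typically be trained on.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical atomic features: each maps (tokens, position) to a string value.
def word_at(offset: int) -> Callable[[List[str], int], str]:
    """Lowercased word at a relative offset, or a padding symbol off the edge."""
    def feature(tokens: List[str], i: int) -> str:
        j = i + offset
        return tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
    return feature

def is_capitalized(tokens: List[str], i: int) -> str:
    """'True' if the current word starts with an uppercase letter."""
    return str(tokens[i][:1].isupper())

ATOMIC: Dict[str, Callable[[List[str], int], str]] = {
    "w0": word_at(0),     # the word itself
    "w-1": word_at(-1),   # previous word
    "w+1": word_at(1),    # next word
    "cap": is_capitalized,
}

# Hypothetical templates: each is a tuple of atomic feature names to combine.
TEMPLATES: List[Tuple[str, ...]] = [
    ("w0",),
    ("cap",),
    ("w-1", "cap"),
    ("cap", "w+1"),
]

def extract_features(tokens: List[str], i: int) -> List[str]:
    """Instantiate every template at position i, yielding one feature string each."""
    features = []
    for template in TEMPLATES:
        values = [ATOMIC[name](tokens, i) for name in template]
        features.append("+".join(template) + "=" + "|".join(values))
    return features

tokens = ["He", "visited", "Boston", "yesterday"]
print(extract_features(tokens, 2))
```

Because each template is just a tuple of atomic feature names, adding or removing a template changes the feature set without touching the extraction code, which is one plausible reading of how templates avoid hand-written expert rules.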