A Hybrid Feature Set based Maximum Entropy Hindi Named Entity Recognition

S. Saha, S. Sarkar, Pabitra Mitra
Abstract: We describe our effort to develop a Named Entity Recognition (NER) system for Hindi using the Maximum Entropy (MaxEnt) approach. We developed an NER-annotated corpus for this purpose. We tried to identify the features most relevant to the Hindi NER task so that an efficient NER system could be built from the limited corpus available. Apart from orthographic and collocation features, we examined the effectiveness of using gazetteer lists as features. We also worked on semi-automatic induction of context patterns and experimented with using these as features in the MaxEnt method. We evaluated the system on a blind test set with four classes: Person, Organization, Location, and Date. Our system achieved an F-value of 81.52%.
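As a rough illustration of the kinds of features the abstract names (orthographic, collocation, and gazetteer), the sketch below builds a per-token feature dictionary of the sort a MaxEnt classifier could consume. It is not the authors' feature set: the function name `token_features`, the gazetteer entries, the transliterated example sentence, and the window size are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical gazetteer entries (transliterated); not taken from the paper.
LOCATION_GAZETTEER: Set[str] = {"dilli", "mumbai", "kolkata"}
PERSON_GAZETTEER: Set[str] = {"sachin", "indira"}


def token_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, object]:
    """Build an illustrative feature dictionary for tokens[i]."""
    word = tokens[i]
    lower = word.lower()
    feats: Dict[str, object] = {
        "word": lower,
        # Orthographic cues (assumed examples): digits and a short suffix.
        "has_digit": any(ch.isdigit() for ch in word),
        "is_numeric": word.isdigit(),
        "suffix2": lower[-2:],
        # Gazetteer membership as binary features.
        "in_location_list": lower in LOCATION_GAZETTEER,
        "in_person_list": lower in PERSON_GAZETTEER,
    }
    # Collocation cues: neighbouring words inside the window, padded at the edges.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        in_range = 0 <= j < len(tokens)
        feats[f"word[{offset:+d}]"] = tokens[j].lower() if in_range else "<PAD>"
    return feats


# Hypothetical transliterated sentence, one feature dictionary per token.
sentence = ["sachin", "mumbai", "mein", "rahte", "hain"]
features = [token_features(sentence, i) for i in range(len(sentence))]
```

Dictionaries like these would typically be converted to sparse vectors and passed to a multinomial logistic regression model, which is the usual formulation of a MaxEnt classifier. Context patterns of the kind mentioned in the abstract could be added as further binary features in the same dictionary.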