Chinese Named Entity Recognition and Disambiguation Based on Multi-stage Clustering

LI Guangyi, WANG Houfeng
DOI: https://doi.org/10.3969/j.issn.1003-0077.2013.05.005
2013-01-01
Abstract: Named entity recognition and disambiguation is an important research topic in natural language understanding. For named entity recognition and disambiguation when an entity knowledge base is provided, this paper presents a method based on multi-stage clustering. First, we link each document to an entity definition in the knowledge base through two rounds of clustering. Second, we group entities that do not exist in the knowledge base using hierarchical agglomerative clustering. Finally, we recognize ordinary words and adjust the results using K-means clustering. Experiments on the data of the CLP-2012 Chinese person name disambiguation task show that our system performs well. The F-score on the test data is 86.68%, exceeding the best result of the Bake-off by 6.46%.
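The second stage described above, grouping mentions that have no knowledge-base entry, can be illustrated with a minimal sketch of hierarchical agglomerative clustering. This is not the authors' implementation: the bag-of-words features, cosine similarity, average linkage, the `threshold` value of 0.3, and all function names are assumptions chosen for illustration. The toy documents are whitespace-tokenized English; Chinese text would first need a word segmenter.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_link(c1, c2, vectors):
    """Mean pairwise similarity between the members of two clusters."""
    total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_cluster(docs, threshold=0.3):
    """Merge the most similar pair of clusters until no pair reaches `threshold`.

    `threshold` is a hypothetical stopping value, not one taken from the paper.
    Returns a list of clusters, each a list of document indices.
    """
    vectors = [Counter(doc.split()) for doc in docs]
    clusters = [[i] for i in range(len(docs))]  # start with one cluster per document
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_link(clusters[x], clusters[y], vectors)
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < threshold:
            break  # remaining clusters are treated as distinct entities
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy contexts for one ambiguous person name (invented for illustration).
docs = [
    "professor university physics research laboratory",
    "physics professor research paper university",
    "football striker club goal season",
    "club striker goal league football",
]
clusters = agglomerative_cluster(docs)
print(clusters)
```

On this toy input the two physics contexts end up in one cluster and the two football contexts in another, so the name would be treated as referring to two distinct people. The stopping threshold determines how many entities are produced, which is why a later adjustment stage such as the K-means step mentioned in the abstract can be useful.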