Identification of Location Names from Chinese Texts Based on Support Vector Machine

LI Li-shuang, HUANG De-gen, CHEN Chun-rong, YANG Yuan-sheng
DOI: https://doi.org/10.3321/j.issn:1000-8608.2007.03.025
2007-01-01
Abstract: Based on the characteristics of location names in Chinese texts, a method for automatically identifying Chinese location names using a support vector machine (SVM) is proposed. Four kinds of features are extracted for each vector: the character itself, its character-based part-of-speech (POS) tag, whether the character appears in a table of location-name characteristic words, and context information. Each sample is represented as a long binary vector, and a training set is built from these vectors. The machine learning models for location-name identification are obtained by experimenting with polynomial kernel functions. The results show that the models are effective at identifying location names in Chinese texts: in the open test, recall, precision and F-measure reach 86.69%, 93.82% and 90.12%, respectively.
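The feature representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the context window of two characters, the toy characteristic-word table, the POS tags in the example, and the kernel degree are all assumptions made for illustration, and every name in the code is hypothetical. Each sample is stored as the set of active dimensions of its binary vector, so the dot product inside the polynomial kernel is simply the size of the intersection.

```python
from typing import Dict, FrozenSet, Sequence, Tuple

# Hypothetical toy table of location-name characteristic characters
# (city, province, county); the paper's actual table is not given here.
LOCATION_SUFFIXES = frozenset({"市", "省", "县"})

# Placeholder (character, POS tag) pair for positions outside the sentence.
PAD = ("<PAD>", "<PAD>")


class FeatureIndex:
    """Assigns each distinct feature string one dimension of the binary vector."""

    def __init__(self) -> None:
        self._index: Dict[str, int] = {}

    def get(self, name: str) -> int:
        # Reuse the existing dimension or allocate the next free one.
        return self._index.setdefault(name, len(self._index))

    def __len__(self) -> int:
        return len(self._index)


def extract(tokens: Sequence[Tuple[str, str]], i: int,
            index: FeatureIndex, window: int = 2) -> FrozenSet[int]:
    """Return the active dimensions for the character at position i.

    For the character and each neighbour within the (assumed) window,
    three kinds of binary features are switched on: the character itself,
    its POS tag, and membership in the characteristic-word table.
    """
    active = set()
    for offset in range(-window, window + 1):
        j = i + offset
        char, pos = tokens[j] if 0 <= j < len(tokens) else PAD
        active.add(index.get(f"char[{offset}]={char}"))
        active.add(index.get(f"pos[{offset}]={pos}"))
        if char in LOCATION_SUFFIXES:
            active.add(index.get(f"suffix[{offset}]"))
    return frozenset(active)


def polynomial_kernel(x: FrozenSet[int], y: FrozenSet[int],
                      degree: int = 2, coef0: float = 1.0) -> float:
    """Polynomial kernel (x . y + coef0) ** degree on sparse binary vectors."""
    # For 0/1 vectors the dot product equals the number of shared active dims.
    return (len(x & y) + coef0) ** degree


# Usage: a five-character sentence with illustrative character-level POS tags.
sentence = [("大", "ns"), ("连", "ns"), ("市", "ns"), ("很", "d"), ("美", "a")]
feature_index = FeatureIndex()
samples = [extract(sentence, i, feature_index) for i in range(len(sentence))]
similarity = polynomial_kernel(samples[1], samples[2])
```

Kernel values like `similarity` are what an SVM trainer would consume; fitting the classifier itself is left to an SVM library and is outside this sketch.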