Lemmatization of Uyghur Inflectional Words

Mairehaba·aili, JIANG Wenbin, Tuergen·yibulayin
DOI: https://doi.org/10.3969/j.issn.1003-0077.2012.01.013
2012-01-01
Abstract: We propose an automatic lemmatization model for inflectional phenomena. In contrast to previous methods, we generalize inflection conceptually and treat lemmatization as a sequence tagging problem. Using the Uyghur million-word part-of-speech tagging corpus as training data, the proposed method improves the F value of lemmatization from 84.1% to 91.4%, and attains an F value of 88.6% for verbs, which are rich in suffixes and morphologically complex.
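As a rough illustration of what casting lemmatization as sequence tagging can look like, the sketch below labels each character of a word as belonging to the stem or to a suffix, then rebuilds the lemma from the labels. The tag names `S` and `X`, the helper functions `encode` and `decode`, and the Latin-script example word are assumptions made for this illustration; the abstract does not specify the tag set or the tagging model actually used, and this sketch handles only pure suffixation with no stem-internal sound changes.

```python
from typing import List, Tuple

# Hypothetical tag names for illustration; not taken from the paper.
STEM, SUFFIX = "S", "X"


def encode(word: str, lemma: str) -> List[Tuple[str, str]]:
    """Turn a (word, lemma) pair into per-character training tags.

    Assumes the lemma is a plain prefix of the inflected word.
    """
    if not word.startswith(lemma):
        raise ValueError("this sketch only handles pure suffixation")
    return [(ch, STEM if i < len(lemma) else SUFFIX)
            for i, ch in enumerate(word)]


def decode(chars: str, tags: List[str]) -> str:
    """Rebuild the lemma by keeping the characters tagged as stem."""
    if len(chars) != len(tags):
        raise ValueError("one tag per character is required")
    return "".join(ch for ch, tag in zip(chars, tags) if tag == STEM)


# Latin-script example: "kitablar" (books) = "kitab" (book) + plural "-lar".
pairs = encode("kitablar", "kitab")
tags = [tag for _, tag in pairs]
lemma = decode("kitablar", tags)
```

In a full system, a trained sequence tagger would predict `tags` for unseen words, and `decode` would turn that prediction into a lemma; here the tags come directly from a known word and lemma pair.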