A Chinese Part-of-speech Tagging Approach Using Conditional Random Fields

HONG Ming-Cai,ZHANG Kuo,TANG Jie,LI Juan-Zi
DOI: https://doi.org/10.3969/j.issn.1002-137X.2006.10.040
2006-01-01
Computer Science
Abstract: This paper presents a new approach to part-of-speech (POS) tagging for Chinese text using conditional random fields (CRFs). Because CRFs accept arbitrary features as input, the approach exploits not only word contexts but also new statistical features designed for multiple-category words and out-of-vocabulary words. Closed and open tests on the People's Daily dataset achieve POS tagging accuracies of 98.56% and 96.60%, respectively.
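The abstract does not specify the feature templates, so the following is only a minimal sketch of what per-token CRF features of the kind described might look like. It assumes the input is already word-segmented, and every name in it (`build_lexicon`, `token_features`, the feature strings) is hypothetical. The "statistical features" here are stand-ins: relative tag frequencies from a training lexicon for multiple-category words, and first character, last character, and length for out-of-vocabulary words. They are not the authors' actual features.

```python
from collections import Counter, defaultdict


def build_lexicon(tagged_sentences):
    """Count how often each word occurs with each tag in the training data."""
    lexicon = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            lexicon[word][tag] += 1
    return lexicon


def token_features(words, i, lexicon):
    """Return a feature dict for the (non-empty) word at position i.

    Assumed feature set, for illustration only.
    """
    word = words[i]
    # Context features: current, previous, and next word.
    feats = {
        "w0=" + word: 1.0,
        "w-1=" + (words[i - 1] if i > 0 else "<BOS>"): 1.0,
        "w+1=" + (words[i + 1] if i + 1 < len(words) else "<EOS>"): 1.0,
    }
    tag_counts = lexicon.get(word)
    if tag_counts:
        # Known word: expose its empirical tag distribution.
        total = sum(tag_counts.values())
        if len(tag_counts) > 1:
            feats["multi_category"] = 1.0
        for tag, count in tag_counts.items():
            feats["p(tag=%s|w)" % tag] = count / total
    else:
        # Out-of-vocabulary word: fall back to surface cues.
        feats["oov"] = 1.0
        feats["first_char=" + word[0]] = 1.0
        feats["last_char=" + word[-1]] = 1.0
        feats["length=%d" % min(len(word), 4)] = 1.0
    return feats
```

Feature dicts of this shape, one per token, are what a typical linear-chain CRF toolkit consumes alongside the gold tag sequence; which toolkit the authors used is not stated in the abstract.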