Part-of-Speech Tagging for Chinese-English Mixed Texts with Dynamic Features

Jiayi Zhao, Xipeng Qiu, Shu Zhang, Feng Ji, Xuanjing Huang
2012-01-01
Abstract: Modern Chinese articles and conversations often include a few English words, especially in emails and Internet literature. Analyzing Chinese-English mixed texts has therefore become an important and challenging topic. The underlying problem is how to assign part-of-speech (POS) tags to the embedded English words. Because no specially annotated corpus is available, most English words are tagged with the oversimplified type "foreign words". In this paper, we present a method that uses dynamic features to tag POS in mixed texts. Experiments show that our method achieves higher performance than traditional sequence labeling methods. It also improves POS tagging performance on pure Chinese texts.