A Transition-based Model for Joint Segmentation, POS-tagging and Normalization

Tao Qian, Yue Zhang, Meishan Zhang, Yafeng Ren, Dong-Hong Ji
DOI: https://doi.org/10.18653/v1/d15-1211
2015-01-01
Abstract: We propose a transition-based model for joint word segmentation, POS tagging, and text normalization. Unlike previous methods, the model can be trained on standard text corpora, which overcomes the lack of annotated microblog corpora. To evaluate the model, we develop an annotated corpus based on microblogs. Experimental results show that the joint model improves word segmentation on microblogs, giving an error reduction of 12.02% in segmentation accuracy compared to the traditional approach.