LM Enhanced BiRNN-CRF for Joint Chinese Word Segmentation and POS Tagging.

Jianhu Zhang, Gongshen Liu, Jie Zhou, Cheng Zhou, Huanrong Sun
DOI: https://doi.org/10.1007/978-3-319-99501-4_9
2018-01-01
Abstract: Word segmentation and part-of-speech tagging are two preliminary but fundamental components of Chinese natural language processing. With the upsurge of deep learning, end-to-end models are built without handcrafted features. In this work, we model Chinese word segmentation and part-of-speech tagging jointly on the basis of the state-of-the-art BiRNN-CRF architecture. LSTM is adopted as the basic recurrent unit. Apart from utilizing pre-trained character embeddings and trigram features, we incorporate a neural language model and conduct multi-task training. Highway layers are applied to tackle the discordance issue of naive co-training. Experimental results on the CTB5, CTB7, and PPD datasets show the effectiveness of the proposed method.
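To make the joint formulation concrete, the sketch below shows one common way to cast segmentation and POS tagging as a single character-level sequence labeling task: each character receives a combined label made of a word-boundary marker and the POS tag of its word. The abstract does not specify the label scheme, so the B/M/E/S markers, the example tags, and the function names to_joint_tags and from_joint_tags are assumptions made for illustration only, not the authors' implementation.

```python
def to_joint_tags(words_with_pos):
    """Convert (word, pos) pairs into per-character joint labels.

    Assumed scheme: B/M/E mark the beginning, middle, and end of a
    multi-character word, and S marks a single-character word.
    """
    chars, tags = [], []
    for word, pos in words_with_pos:
        if len(word) == 1:
            boundaries = ["S"]
        else:
            boundaries = ["B"] + ["M"] * (len(word) - 2) + ["E"]
        for ch, boundary in zip(word, boundaries):
            chars.append(ch)
            tags.append(f"{boundary}-{pos}")
    return chars, tags


def from_joint_tags(chars, tags):
    """Recover (word, pos) pairs from per-character joint labels.

    A word is closed whenever an S or E boundary marker is seen; the POS
    tag is read from that closing character. Trailing characters that
    never reach a closing marker are dropped.
    """
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        boundary, pos = tag.split("-", 1)
        current += ch
        if boundary in ("S", "E"):
            words.append((current, pos))
            current = ""
    return words


# Hypothetical example sentence with illustrative POS tags.
sentence = [("我", "PN"), ("喜欢", "VV"), ("自然语言", "NN")]
chars, tags = to_joint_tags(sentence)
print(list(zip(chars, tags)))
print(from_joint_tags(chars, tags))
```

Under this encoding, a tagger such as a BiRNN-CRF only has to predict one label per character, and decoding the label sequence yields both the word boundaries and the POS tags at once.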