Neural Domain Adaptation with Contextualized Character Embedding for Chinese Word Segmentation.

Zuyi Bao, Si Li, Sheng Gao, Weiran Xu
DOI: https://doi.org/10.1007/978-3-319-73618-1_35
2017-01-01
Abstract: Large-scale annotated newswire data is available for Chinese word segmentation. However, some studies show that a segmenter's performance decreases significantly when a model trained on newswire is applied to other domains, such as patents and literature. The same character can appear in different positions and carry different meanings in different words. In this paper, we introduce contextualized character embedding to neural domain adaptation for Chinese word segmentation. The contextualized character embedding aims to capture the dimensions of the embedding that are useful for the target domain. Experimental results show that the proposed method achieves performance competitive with previous domain adaptation methods for Chinese word segmentation.