Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text.

Junjie Xing, Kenny Zhu, Shaodian Zhang
2018-01-01
Abstract: Chinese word segmentation (CWS) models trained on open-source corpora suffer a dramatic performance drop when applied to domain-specific text, especially in domains with many specialized terms and diverse writing styles, such as the biomedical domain. However, building a domain-specific CWS model requires extremely high annotation cost. In this paper, we propose an approach that transfers domain-invariant knowledge from high-resource to low-resource domains. Extensive experiments show that our model achieves consistently higher accuracy than single-task CWS and other transfer learning baselines, especially when there is a large disparity between the source and target domains.