Unsupervised Domain Adaptation for Joint Segmentation and POS-Tagging.

Yang Liu, Yue Zhang
2012-01-01
Abstract: We report an empirical investigation on type-supervised domain adaptation for joint Chinese word segmentation and POS-tagging, making use of domain-specific tag dictionaries and only unlabeled target domain data to improve target-domain accuracies, given a set of annotated source domain sentences. Previous work on POS-tagging of other languages showed that type-supervision can be a competitive alternative to token-supervision, while semi-supervised techniques such as label propagation are important to the effectiveness of type-supervision. We report similar findings using a novel approach for joint Chinese segmentation and POS-tagging, under a cross-domain setting. With the help of unlabeled sentences and a lexicon of 3,000 words, we obtain 33% error reduction in target-domain tagging. In addition, combined type- and token-supervision can lead to improved cost-effectiveness.
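To make the idea of type-supervision concrete, the following is a minimal sketch of how a tag dictionary can restrict the candidate tags of a word at decoding time. It is not the joint segmentation and tagging model described in the abstract: the tag set, the toy lexicon, the scores, and the function names (`allowed_tags`, `tag_with_dictionary`) are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical tag inventory; a real system would use its corpus's tag set.
ALL_TAGS = ("AD", "JJ", "NN", "VV")


def allowed_tags(word: str, tag_dict: Dict[str, Set[str]]) -> Set[str]:
    """Tags licensed by the dictionary, or every tag for out-of-dictionary words."""
    return tag_dict.get(word, set(ALL_TAGS))


def tag_with_dictionary(
    words: List[str],
    scores: Dict[Tuple[str, str], float],
    tag_dict: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Pick, per word, the highest-scoring tag among those the dictionary allows.

    Assumption: `scores` stands in for a model trained on source-domain data;
    missing (word, tag) pairs score 0.0.
    """
    tagged = []
    for word in words:
        candidates = sorted(allowed_tags(word, tag_dict))
        best = max(candidates, key=lambda tag: scores.get((word, tag), 0.0))
        tagged.append((word, best))
    return tagged


# Toy data: the source-trained scores prefer VV for "bank",
# but the target-domain dictionary only licenses NN.
toy_dict = {"bank": {"NN"}}
toy_scores = {("bank", "VV"): 0.9, ("bank", "NN"): 0.4, ("opens", "VV"): 0.8}
result = tag_with_dictionary(["bank", "opens"], toy_scores, toy_dict)
unconstrained = tag_with_dictionary(["bank"], toy_scores, {})
```

In this toy setting the dictionary entry overrides the source-domain preference for "bank", while words missing from the dictionary fall back to the full tag set. This per-token pruning only illustrates the general principle of dictionary constraints and does not model segmentation or the use of unlabeled data.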