Capturing Long-distance Dependencies in Sequence Models: A Case Study of Chinese Part-of-speech Tagging.

Weiwei Sun, Xiaochang Peng, Xiaojun Wan
2013-01-01
Abstract: This paper is concerned with capturing long-distance dependencies in sequence models. We propose a two-step strategy. First, the stacked learning technique is applied to integrate sequence models, which are good at exploiting local information, with higher-complexity models that are good at capturing long-distance dependencies. Second, the structure compilation technique is employed to transfer the predictive power of the resulting hybrid models to sequence models via large-scale unlabeled data. To investigate the feasibility of this idea, we study Chinese POS tagging. Experiments on the Chinese Treebank data demonstrate the effectiveness of our methods. The re-compiled models not only achieve high per-token classification accuracy but also serve well as a front-end to a parser.
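The two-step strategy in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `CountTagger` is a hypothetical count-based per-token classifier standing in for a real sequence model (such as a CRF), `long_distance_model` is a hand-written rule standing in for a high-complexity model, and the English sentences and tag set are invented purely for illustration. Because the stand-in guide model is rule-based rather than trained, the sketch also skips the cross-validation that stacked learning normally uses to produce guide tags for the training data.

```python
from collections import Counter, defaultdict

# Toy lexicon used only by the hypothetical long-distance model below.
LEXICON = {"please": "INTJ", "the": "DET", "a": "DET", "it": "PRON",
           "fell": "VERB", "just": "ADV", "now": "ADV"}

def long_distance_model(words):
    """Hypothetical stand-in for a high-complexity model (e.g. a parser).
    It tags "record" by looking at the first word, however far away it is."""
    verb_context = words[0] == "please"
    return [("VERB" if verb_context else "NOUN") if w == "record"
            else LEXICON.get(w, "NOUN") for w in words]

def local_features(words, i, guide=None):
    """Local window only: current word plus previous word, then word alone."""
    prev = words[i - 1] if i > 0 else "<s>"
    return [("w+prev", words[i], prev), ("w", words[i])]

def stacked_features(words, i, guide):
    """Stacking: the guide model's tag is added as a feature, ahead of local ones."""
    return [("w+guide", words[i], guide[i]), ("guide", guide[i])] + local_features(words, i)

class CountTagger:
    """Toy tagger (assumption, not a real sequence model): for each token it
    returns the most frequent tag of the first feature key seen in training."""
    def __init__(self, feature_fn):
        self.feature_fn = feature_fn
        self.counts = defaultdict(Counter)

    def fit(self, sents, tags, guides=None):
        for n, (words, gold) in enumerate(zip(sents, tags)):
            guide = guides[n] if guides else None
            for i, tag in enumerate(gold):
                for key in self.feature_fn(words, i, guide):
                    self.counts[key][tag] += 1
        return self

    def predict(self, words, guide=None):
        out = []
        for i in range(len(words)):
            for key in self.feature_fn(words, i, guide):
                if key in self.counts:
                    out.append(self.counts[key].most_common(1)[0][0])
                    break
            else:
                out.append("NOUN")  # arbitrary fallback when no key was seen
        return out

# Invented toy data: a small labeled set and a (notionally large) unlabeled set.
train_sents = [["please", "record", "it"], ["the", "record", "fell"],
               ["a", "record", "fell"], ["it", "just", "fell"]]
train_tags = [["INTJ", "VERB", "PRON"], ["DET", "NOUN", "VERB"],
              ["DET", "NOUN", "VERB"], ["PRON", "ADV", "VERB"]]
unlabeled = [["please", "now", "just", "record", "it"],
             ["please", "just", "record", "it", "now"]]
test_sent = ["please", "just", "record", "it"]

# Baseline: local model trained on the labeled data only.
local = CountTagger(local_features).fit(train_sents, train_tags)

# Step 1 (stacked learning): train a hybrid model that sees the guide tags.
guides = [long_distance_model(s) for s in train_sents]
hybrid = CountTagger(stacked_features).fit(train_sents, train_tags, guides)

# Step 2 (structure compilation): label unlabeled text with the hybrid model,
# then retrain a purely local model on labeled plus auto-labeled data.
auto_tags = [hybrid.predict(s, long_distance_model(s)) for s in unlabeled]
compiled = CountTagger(local_features).fit(train_sents + unlabeled,
                                           train_tags + auto_tags)

local_pred = local.predict(test_sent)
hybrid_pred = hybrid.predict(test_sent, long_distance_model(test_sent))
compiled_pred = compiled.predict(test_sent)
```

In this toy setting the baseline local model mis-tags "record" in the test sentence because it never saw the context "just record", while the hybrid model gets it right by relying on the guide tag. The compiled model also gets it right, yet it uses only local features and does not need the expensive guide model at prediction time, which is the motivation the abstract gives for compiling hybrid models back into sequence models.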