Exploiting Multiple Treebanks for Parsing with Quasi-Synchronous Grammars

Zhenghua Li, Ting Liu, Wanxiang Che
2012-01-01
Abstract: We present a simple and effective framework for exploiting multiple monolingual treebanks with different annotation guidelines for parsing. Several types of transformation patterns (TPs) are designed to capture the systematic annotation inconsistencies among different treebanks. Based on these TPs, we design quasi-synchronous grammar features to augment the baseline parsing models. Our approach can significantly advance the state-of-the-art parsing accuracy on two widely used target treebanks (Penn Chinese Treebank 5.1 and 6.0) using the Chinese Dependency Treebank as the source treebank. The improvements are 1.37% and 1.10%, respectively, with automatic part-of-speech tags. Moreover, an indirect comparison indicates that our approach also outperforms previous work based on treebank conversion.