Sub-Sentence Division for Tree-Based Machine Translation.

Hao Xiong, Wenwen Xu, Haitao Mi, Yang Liu, Qun Liu
DOI: https://doi.org/10.3115/1667583.1667626
2009-01-01
Abstract: Tree-based statistical machine translation models have made significant progress in recent years, especially when replacing 1-best trees with packed forests. However, because parsing accuracy usually drops sharply as sentence length increases, translating long sentences often takes a long time and produces only degenerate translations. We propose a new method, named sub-sentence division, that reduces decoding time and improves translation quality for tree-based translation. Our approach divides long sentences into several sub-sentences by exploiting tree structures. Large-scale experiments on the NIST 2008 Chinese-to-English test set show that our approach achieves an absolute improvement of 1.1 BLEU points over the baseline system in 50% less time.