An Improved Hierarchical Phrase Based Machine Translation Model

Zhanyi Liu, Ting Liu, Sheng Li
2011-01-01
Abstract: The hierarchical phrase-based translation model is a state-of-the-art statistical machine translation (SMT) method that combines the advantages of phrase-based SMT models and syntax-based MT models. However, the model neglects word correlations when hierarchical phrases are automatically extracted from a word-aligned bilingual corpus: an idiomatic expression in an input sentence may be split into several fragments, and concatenating the separately translated fragments is prone to producing a wrong translation. To address this issue, this paper proposes integrating a statistical collocation model that computes collocation probabilities for hierarchical phrases, thereby improving the quality of the extracted hierarchical phrases. Experimental results show that the proposed methods effectively improve the performance of SMT systems. Compared with the baseline systems, they achieve absolute improvements of 0.39 and 0.60 BLEU points on the Chinese-to-English test sets NIST-MT04 and NIST-MT08, respectively.