A Tree-to-tree Alignment-Based Model for Statistical Machine Translation

Min Zhang, Hongfei Jiang, Ai Ti Aw, Jun Sun, Sheng Li, Chew Lim Tan
2007-01-01
Abstract: This paper presents a novel statistical machine translation (SMT) model that uses tree-to-tree alignment between a source parse tree and a target parse tree. Formally, the model is a probabilistic synchronous tree-substitution grammar (STSG): a collection of aligned elementary tree pairs with mapping probabilities, which are learned automatically from word-aligned, bi-parsed parallel texts. Unlike previous syntax-based SMT models, this model supports multi-level global structure distortion of the tree topology and can fully utilize the structural features of both the source and target parse trees, which gives our system more expressive power and flexibility. Experimental results on the small HIT bi-parsed dataset show that our method performs significantly better than Pharaoh, a state-of-the-art phrase-based SMT system, and than other syntax-based methods such as the synchronous CFG-based method.
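To make the idea of "aligned elementary tree pairs with mapping probabilities" concrete, the following is a minimal sketch, not the authors' implementation. It assumes that rule extraction has already produced aligned (source, target) elementary tree pairs, and that mapping probabilities are estimated by relative frequency conditioned on the source tree. The abstract does not state the estimator, so that choice is an assumption, as are the toy rules, the bracketed-string encoding, and the function name `estimate_mapping_probs`.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each item is one aligned (source, target) elementary
# tree pair, written as a bracketed string with x0, x1 marking linked
# substitution sites. These rules are illustrative, not taken from the paper.
extracted_pairs = [
    ("(VP (VBZ x0) (NP x1))", "(VP (NP x1) (VV x0))"),
    ("(VP (VBZ x0) (NP x1))", "(VP (NP x1) (VV x0))"),
    ("(VP (VBZ x0) (NP x1))", "(VP (VV x0) (NP x1))"),
    ("(NP (DT the) (NN x0))", "(NP (NN x0))"),
]


def estimate_mapping_probs(pairs):
    """Relative-frequency estimate of P(target tree | source tree).

    Assumption: this estimator is a common default for such grammars; the
    abstract only says the probabilities are learned automatically.
    """
    pair_counts = Counter(pairs)
    source_counts = Counter(src for src, _ in pairs)
    probs = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        probs[src][tgt] = n / source_counts[src]
    return dict(probs)


# The "grammar" is a table from a source elementary tree to its candidate
# target elementary trees, each with a mapping probability.
grammar = estimate_mapping_probs(extracted_pairs)

for src, targets in grammar.items():
    for tgt, p in targets.items():
        print(f"{src}  ->  {tgt}   p={p:.3f}")
```

In this toy table, the first source tree maps to a reordered target tree more often than to a monotone one, which is the kind of structural reordering a tree-pair rule can encode directly.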