Abstract: In current statistical machine translation, IBM-model-based word alignment is widely used as a starting point for building phrase-based machine translation systems. However, this alignment model is separate from the rest of the machine translation pipeline and is optimized independently. Furthermore, the alignment model does not take structural information into account, which sometimes leads to incorrect alignments. In this paper, we present a novel method that connects a re-alignment model with a translation model in an integrated framework. We conduct bilingual chart parsing based on a syntax-augmented synchronous context-free grammar. A Viterbi derivation tree is generated for each sentence pair, with multiple features employed in a log-linear model. A new word alignment is then created under the structural constraint imposed by the Viterbi tree. Extensive experiments are conducted on a Farsi-to-English translation task in the conversational speech domain and on a German-to-English translation task in the text domain. Systems trained on the new alignment achieve significantly higher BLEU scores than a state-of-the-art baseline.

Towards Integrated Machine Translation Using Structural Alignment From Syntax-Augmented Synchronous Parsing