Abstract: The past several years have witnessed rapid advances in syntax-based machine translation, which exploits natural language syntax to guide translation. Depending on the type of input, most of these efforts can be divided into two broad categories: (a) string-based systems whose input is a string, which is simultaneously parsed and translated by a synchronous grammar (Wu, 1997; Chiang, 2005; Galley et al., 2006), and (b) tree-based systems whose input is already a parse tree to be directly converted into a target tree or string (Lin, 2004; Ding and Palmer, 2005; Quirk et al., 2005; Liu et al., 2006; Huang et al., 2006). Compared with their string-based counterparts, tree-based systems offer many attractive features: they are much faster in decoding (linear time vs. cubic time), do not require sophisticated binarization (Zhang et al., 2006), and can use separate grammars for parsing and translation (e.g. a context-free grammar for the former and a tree substitution grammar for the latter). However, despite these advantages, most tree-based systems suffer from a major drawback: they use only 1-best parse trees to direct translation, which potentially introduces translation mistakes due to parsing errors (Quirk and Corston-Oliver, 2006). This situation becomes worse for resource-poor source languages that lack enough Treebank data to train a high-accuracy parser. This problem can be alleviated elegantly by using packed forests (Huang, 2008), which encode exponentially many parse trees in polynomial space. Forest-based systems (Mi et al., 2008; Mi and Huang, 2008) thus take a packed forest instead of a parse tree as input. In addition, packed forests can also be used for translation rule extraction, which helps alleviate the propagation of parsing errors into the rule set.
Forest-based translation can be regarded as a compromise between the string-based and tree-based methods that combines the advantages of both: decoding is still fast, yet the system does not commit to a single parse. Surprisingly, translating a forest of millions of trees is even faster than translating 30 individual trees, and offers significantly better translation quality. This approach has since become a popular topic.
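
To make the compactness claim concrete, the following is a minimal sketch of a toy packed forest, not the representation used by any of the cited systems. It assumes a maximally ambiguous binary grammar in which every span of the input can be split at every interior point; the names `build_forest` and `count_trees` are hypothetical and chosen only for illustration. Each span is stored once as a node, each way of splitting it is a hyperedge, and the number of trees the forest encodes is computed bottom-up without enumerating them.

```python
from functools import lru_cache


def build_forest(n):
    """Build a toy packed forest over an n-word sentence.

    Assumption (illustrative only): every span (i, j) may be split at every
    interior point k, as under a grammar X -> X X | word.
    Returns a dict mapping each span to its list of hyperedges; a hyperedge
    is a tuple of child spans, and a one-word span has one empty hyperedge.
    """
    forest = {}
    for width in range(1, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            if width == 1:
                forest[(i, j)] = [()]  # leaf: a single word
            else:
                forest[(i, j)] = [((i, k), (k, j)) for k in range(i + 1, j)]
    return forest


def count_trees(forest, root):
    """Count the trees encoded under `root` by dynamic programming."""

    @lru_cache(maxsize=None)
    def inside(node):
        total = 0
        for tails in forest[node]:
            product = 1
            for tail in tails:
                product *= inside(tail)
            total += product  # sum over alternative hyperedges
        return total

    return inside(root)


n = 12
forest = build_forest(n)
num_nodes = len(forest)
num_edges = sum(len(edges) for edges in forest.values())
num_trees = count_trees(forest, (0, n))
print(num_nodes, num_edges, num_trees)  # prints: 78 298 58786
```

For a 12-word input, 78 nodes and 298 hyperedges stand for 58,786 distinct binary trees. In this toy setting the node and hyperedge counts grow polynomially with sentence length (quadratically and cubically, respectively), while the tree count follows the Catalan numbers and grows exponentially.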

Tree-Based and Forest-Based Translation
