Abstract:Neural machine translation (NMT), a new approach to machine translation, has achieved promising results comparable to those of traditional approaches such as statistical machine translation (SMT). Despite its recent success, NMT cannot handle a larger vocabulary because the training complexity and decoding complexity proportionally increase with the number of target words. This problem becomes even more serious when translating patent documents, which contain many technical terms that are observed infrequently. In this paper, we propose to select phrases that contain out-of-vocabulary words using the statistical approach of branching entropy. This allows the proposed NMT system to be applied to a translation task of any language pair without any language-specific knowledge about technical term identification. The selected phrases are then replaced with tokens during training and post-translated by the phrase translation table of SMT. Evaluation on Japanese-to-Chinese, Chinese-to-Japanese, Japanese-to-English and English-to-Japanese patent sentence translation proved the effectiveness of phrases selected with branching entropy, where the proposed NMT model achieves a substantial improvement over a baseline NMT model without our proposed technique. Moreover, the number of translation errors of under-translation by the baseline NMT model without our proposed technique reduces to around half by the proposed NMT model.

Vocabulary Selection Strategies for Neural Machine Translation

On Using Very Large Target Vocabulary for Neural Machine Translation

Vocabulary Manipulation for Neural Machine Translation

Vocabulary Learning Via Optimal Transport for Neural Machine Translation

Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input

Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models

From Fully Trained to Fully Random Embeddings: Improving Neural Machine Translation with Compact Word Embedding Tables

A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation

Neural Machine Translation with Word Predictions.

Neural Machine Translation Model with a Large Vocabulary Selected by Branching Entropy

Embedding Word Similarity with Neural Machine Translation

An Efficient Character-Level Neural Machine Translation.

Neural Machine Translation with Extended Context

Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation

Context-Dependent Translation Selection Using Convolutional Neural Network

Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation

Finding the Optimal Vocabulary Size for Neural Machine Translation

Finding Better Subword Segmentation for Neural Machine Translation

Selective Attention for Context-aware Neural Machine Translation

Massive Exploration of Neural Machine Translation Architectures

Optimizing Segmentation Granularity for Neural Machine Translation