Improving Back-Translation with Uncertainty-based Confidence Estimation

Shuo Wang, Yang Liu, Chao Wang, Huanbo Luan, Maosong Sun
DOI: https://doi.org/10.48550/arXiv.1909.00157
2019-08-31
Abstract: While back-translation is simple and effective in exploiting abundant monolingual corpora to improve low-resource neural machine translation (NMT), the synthetic bilingual corpora generated by NMT models trained on limited authentic bilingual data are inevitably noisy. In this work, we propose to quantify the confidence of NMT model predictions based on model uncertainty. With word- and sentence-level confidence measures based on uncertainty, it is possible for back-translation to better cope with noise in synthetic bilingual corpora. Experiments on Chinese-English and English-German translation tasks show that uncertainty-based confidence estimation significantly improves the performance of back-translation.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the noise that arises when back-translation is used to exploit abundant monolingual corpora in low-resource neural machine translation (NMT). The synthetic bilingual corpora are generated by NMT models trained on limited authentic bilingual data, so they inevitably contain translation errors. These errors propagate into subsequent training steps and reduce the benefit of back-translation. To address this, the authors propose estimating the confidence of NMT model predictions from model uncertainty, so that training can better cope with noise in the synthetic corpora.

### Main Contributions

1. **Uncertainty Quantification**: A confidence estimation method based on model uncertainty, which quantifies the confidence of model predictions through the expectation and variance of word-level and sentence-level translation probabilities.
2. **Confidence-Aware Training**: Confidence information is incorporated into NMT training by modifying the likelihood function and the attention mechanism, so that the model can use noisy data more effectively.
3. **Experimental Verification**: Experiments on Chinese-English and English-German translation tasks show that the proposed confidence estimation significantly improves the performance of back-translation.

### Method Overview

1. **Uncertainty Calculation**:
   - Use Monte Carlo Dropout to sample from the NMT model and compute the expectation and variance of word-level and sentence-level translation probabilities.
   - Treat the variance of the translation probabilities as the measure of model uncertainty.
2. **Confidence Measurement**:
   - Four confidence measures are proposed: Predicted Translation Probability (PTP), Expected Translation Probability (EXP), Variance of Translation Probability (VAR), and a Combination of Expectation and Variance (CEV).
   - In the experiments, CEV, which combines expectation and variance, performs best.
3. **Confidence-Aware Training**:
   - **Sentence-Level Confidence**: Use sentence-level confidence as a weight in the likelihood function during back-translation training, reducing the influence of low-confidence sentences on parameter estimation.
   - **Word-Level Confidence**: Build a word-level confidence vector and modify the attention mechanism so that the model attends more to high-confidence words.

### Experimental Results

- **Chinese-English Task**: On multiple test sets, uncertainty-based confidence estimation significantly outperforms both training on authentic bilingual data alone and standard back-translation.
- **English-German Task**: Uncertainty-based confidence estimation outperforms standard back-translation and also the neural quality estimation method (NEURALQE), which requires additional annotated data.

### Conclusion

By estimating confidence from model uncertainty, the paper mitigates the effect of noise in synthetic bilingual corpora during back-translation and significantly improves low-resource neural machine translation.
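The Monte Carlo Dropout estimate and the sentence-level weighting described in the method overview can be illustrated with a small NumPy sketch. This is not the authors' implementation: the toy output layer, the function names (`token_probs_with_dropout`, `mc_dropout_confidence`, `confidence_weighted_nll`), the dropout rate, and the number of samples are all assumptions made for illustration. The sketch also assumes that the sentence-level probability is the product of word-level probabilities, and it does not reproduce the CEV formula, which the summary does not state.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def token_probs_with_dropout(hidden, out_weights, target_ids, drop_rate, rng):
    """One stochastic forward pass of a toy output layer.

    Hypothetical stand-in for an NMT decoder: returns the probability of
    each target word under one random dropout mask.
    """
    keep = 1.0 - drop_rate
    mask = rng.random(hidden.shape) < keep       # Bernoulli keep mask
    dropped = hidden * mask / keep               # inverted dropout scaling
    probs = softmax(dropped @ out_weights)       # shape (T, V)
    return probs[np.arange(len(target_ids)), target_ids]


def mc_dropout_confidence(hidden, out_weights, target_ids,
                          drop_rate=0.1, num_samples=20, seed=0):
    """Expectation and variance of translation probabilities over K passes."""
    rng = np.random.default_rng(seed)
    word = np.stack([
        token_probs_with_dropout(hidden, out_weights, target_ids, drop_rate, rng)
        for _ in range(num_samples)
    ])                                           # shape (K, T)
    sent = word.prod(axis=1)                     # assumed: product of word probs
    return {
        "word_exp": word.mean(axis=0),           # word-level expectation
        "word_var": word.var(axis=0),            # word-level variance
        "sent_exp": sent.mean(),                 # sentence-level expectation
        "sent_var": sent.var(),                  # sentence-level variance
    }


def confidence_weighted_nll(sentence_log_probs, confidences):
    """Negative log-likelihood with each sentence pair scaled by its confidence."""
    weights = np.asarray(confidences, dtype=float)
    log_probs = np.asarray(sentence_log_probs, dtype=float)
    return float(-(weights * log_probs).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    hidden = rng.normal(size=(4, 8))             # 4 target positions, 8 hidden units
    out_weights = rng.normal(size=(8, 5))        # vocabulary of 5 words
    target_ids = np.array([0, 3, 1, 4])
    print(mc_dropout_confidence(hidden, out_weights, target_ids))
```

In this sketch a low-confidence synthetic sentence pair contributes less to the loss because its log-likelihood term is multiplied by a smaller weight. How the expectation and variance are mapped to that weight is left open here.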