An Optimized Regularization Method to Enhance Low-Resource MT

Yatu Ji, Hongxu Hou, Ying Lei, Zhong Ren
DOI: https://doi.org/10.1007/978-981-13-5907-1_30
2019-01-01
Abstract: Overfitting caused by scarce parallel corpora is a serious problem in low-resource machine translation, resulting in translation models that generalize poorly. Dropout and Dropconnect address this issue by randomly dropping neurons or weights during training, which improves generalization. In this paper, we optimize Dropconnect for low-resource machine translation by applying a Gaussian approximation to the Bernoulli distribution, and we introduce an integration that alleviates the uneven sampling effect in Dropout and Dropconnect, especially the inadequate training it causes. This is an effective way to approximate the mask calculations with linear operations while keeping the model fully trained. An interesting finding is that adhesive (agglutinative) languages are more sensitive to our regularization methods. Our approach outperforms Dropout and Dropconnect on low-resource translation tasks.
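As a rough illustration of what a Gaussian approximation to Bernoulli weight masks can look like, the sketch below compares an explicit per-weight Bernoulli mask with a moment-matched Gaussian sample for a single linear pre-activation. It is not the paper's implementation: the function names, the single-unit setting, and the use of the keep probability `p` are assumptions made for this example, and the moment matching shown is the standard one for a sum of independently masked terms.

```python
import math
import random


def dropconnect_bernoulli(w, x, p, rng):
    """Pre-activation with an explicit Bernoulli keep-mask on every weight.

    Each term w_i * x_i is kept with probability p (hypothetical helper).
    """
    return sum(wi * xi for wi, xi in zip(w, x) if rng.random() < p)


def dropconnect_gaussian_moments(w, x, p):
    """Mean and variance of the masked sum sum_i m_i * w_i * x_i, m_i ~ Bernoulli(p)."""
    terms = [wi * xi for wi, xi in zip(w, x)]
    mean = p * sum(terms)
    var = p * (1.0 - p) * sum(t * t for t in terms)
    return mean, var


def dropconnect_gaussian(w, x, p, rng):
    """Draw the pre-activation from the moment-matched Gaussian.

    This replaces one Bernoulli draw per weight with a single Gaussian draw
    per unit; the mean and variance are plain linear operations on w and x.
    """
    mean, var = dropconnect_gaussian_moments(w, x, p)
    return rng.gauss(mean, math.sqrt(var))


if __name__ == "__main__":
    rng = random.Random(0)
    w, x, p = [1.0, 2.0], [3.0, 4.0], 0.5
    print(dropconnect_gaussian_moments(w, x, p))
    print(dropconnect_bernoulli(w, x, p, rng), dropconnect_gaussian(w, x, p, rng))
```

With only two weights the Gaussian is a poor match for the discrete distribution of the masked sum; the approximation is better motivated for wide layers, where many independent terms are summed.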