Learning Domain Specific Sub-layer Latent Variable for Multi-Domain Adaptation Neural Machine Translation
Shuanghong Huang, Chong Feng, Ge Shi, Zhengjun Li, Xuan Zhao, Xinyan Li, Xiaomei Wang
DOI: https://doi.org/10.1145/3661305
IF: 1.471
2024-04-29
ACM Transactions on Asian and Low-Resource Language Information Processing
Abstract: Domain adaptation is an effective remedy for inadequate translation performance in specific domains. However, the straightforward approach of mixing data from multiple domains to train a multi-domain neural machine translation (NMT) model can cause parameter interference between domains, which degrades overall performance. To address this, we introduce a multi-domain adaptive NMT method that learns domain-specific sub-layer latent variables and employs the Gumbel-Softmax reparameterization technique to train the model parameters and the domain-specific sub-layer latent variables jointly. This approach facilitates the learning of private domain-specific knowledge while sharing common domain-invariant knowledge, effectively mitigating the parameter interference problem. Experimental results show that the proposed method improves significantly over the baseline model, by up to 7.68 BLEU on an English-German public multi-domain dataset and up to 3.71 BLEU on a Chinese-English one.
computer science, artificial intelligence
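
Illustrative note: the abstract names the Gumbel-Softmax reparameterization but does not say how the sub-layer latent variable is parameterized. The sketch below is therefore only one plausible reading, not the authors' implementation. It assumes a two-way choice (skip or use) per domain and sub-layer; the names `gumbel_softmax`, `gated_sublayer`, `gate_logits` and `tau` are hypothetical and chosen for illustration.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return a relaxed one-hot sample over len(logits) categories.

    Each logit is perturbed with Gumbel(0, 1) noise and passed through a
    temperature-scaled softmax, which keeps the sample differentiable
    with respect to the logits in an autodiff framework.
    """
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / tau)
    # Numerically stable softmax.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def gated_sublayer(x, sublayer, gate_logits, tau=1.0, rng=random):
    """Residual sub-layer whose contribution is scaled by a sampled gate.

    ASSUMPTION: gate_logits = [skip_logit, use_logit] for one
    (domain, sub-layer) pair; the paper's actual parameterization
    is not given in the abstract.
    """
    _skip_weight, use_weight = gumbel_softmax(gate_logits, tau, rng)
    y = sublayer(x)
    return [xi + use_weight * yi for xi, yi in zip(x, y)]
```

Under this reading, a domain whose `gate_logits` strongly favor "use" keeps the sub-layer, while another domain can learn to skip it. Lowering `tau` pushes the sampled gate closer to a hard 0/1 decision.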