Towards More Diverse Input Representation for Neural Machine Translation

Kehai Chen,Rui Wang,Masao Utiyama,Eiichiro Sumita,Tiejun Zhao,Muyun Yang,Hai Zhao
DOI: https://doi.org/10.1109/taslp.2020.2996077
2020-01-01
Abstract:Source input information plays a very important role in the Transformer-based translation system. In practice, word embedding and positional embedding of each word are added as the input representation. Then self-attention networks are used to encode the global dependencies in the input representation to generate a source representation. However, this processing on the source representation only adopts a single source feature and excludes richer and more diverse features such as recurrence features, local features, and syntactic features, which results in tedious representation and thereby hinders the further translation performance improvement. In this paper, we introduce a simple and efficient method to encode more diverse source features into the input representation simultaneously, and thereby learning an effective source representation by self-attention networks. In particular, the proposed grouped strategy is only applied to the input representation layer, to keep the diversity of translation information and the efficiency of the self-attention networks at the same time. Experimental results show that our approach improves the translation performance over the state-of-the-art baselines of Transformer in regard to WMT14 English-to-German and NIST Chinese-to-English machine translation tasks.
What problem does this paper attempt to address?