Abstract: Understanding representation transfer in multilingual neural machine translation can reveal the representational issue causing the zero-shot translation deficiency. In this work, we introduce the identity pair, a sentence translated into itself, to address the lack of the base measure in multilingual investigations, as the identity pair represents the optimal state of representation among any language transfers. In our analysis, we demonstrate that the encoder transfers the source language to the representational subspace of the target language instead of the language-agnostic state. Thus, the zero-shot translation deficiency arises because representations are entangled with other languages and are not transferred effectively to the target language. Based on our findings, we propose two methods: 1) low-rank language-specific embedding at the encoder, and 2) language-specific contrastive learning of the representation at the decoder. The experimental results on Europarl-15, TED-19, and OPUS-100 datasets show that our methods substantially enhance the performance of zero-shot translations by improving language transfer capacity, thereby providing practical evidence to support our conclusions.
What problem does this paper attempt to address?
This paper addresses the poor performance of zero-shot translation in multilingual neural machine translation (MNMT). The authors study how representations are transferred between languages by introducing "identity pairs": a sentence translated into itself, which serves as a baseline for the optimal representation in any language transfer. They find that, in zero-shot translation, current MNMT models do not transfer the source-language representation effectively into the representation space of the target language. Instead, the representation stays entangled with those of other languages, which degrades translation quality.
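The idea of an identity pair can be made concrete with a small sketch. It only builds the data structure described above (source and target are the same sentence in the same language); the function name, the dictionary keys, and the language codes are hypothetical and are not taken from the paper.

```python
def make_identity_pairs(sentences_by_lang):
    """Turn monolingual sentences into identity pairs (source == target).

    sentences_by_lang maps a language code to a list of sentences.
    The record layout below is an assumption made for illustration.
    """
    pairs = []
    for lang, sentences in sentences_by_lang.items():
        for sentence in sentences:
            pairs.append({
                "src_lang": lang,
                "tgt_lang": lang,   # same language on both sides
                "src": sentence,
                "tgt": sentence,    # the sentence is "translated" into itself
            })
    return pairs


pairs = make_identity_pairs({
    "de": ["Guten Morgen."],
    "fr": ["Bonjour.", "Merci."],
})
```

Representations obtained from such pairs could then serve as the reference point against which representations from real translation directions are compared.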
### Main problems
1. **Insufficient zero-shot translation performance**:
- Zero-shot translation means translating between language pairs that were never seen together during training. Existing MNMT models perform poorly in this setting, mainly because the source-language representation is not transferred effectively into the representation space of the target language.
2. **Effectiveness of representation transfer**:
- The authors find that the encoder transfers the source-language representation into the subspace of the target language, rather than into a language-agnostic state. When this transfer is incomplete, the representation remains entangled with other languages, and zero-shot translation performance declines.
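To make the notion of a zero-shot direction concrete, the sketch below splits all translation directions into supervised and zero-shot ones. It assumes an English-centric training set, where every training pair has English on one side; that setup, the function name, and the language codes are assumptions for illustration and are not stated in the summary above.

```python
from itertools import permutations


def split_directions(languages, pivot="en"):
    """Split ordered language pairs into supervised and zero-shot directions.

    Assumption: training data is pivot-centric, so a direction is seen
    during training only if the pivot language is its source or target.
    """
    all_directions = list(permutations(languages, 2))
    supervised = [d for d in all_directions if pivot in d]
    zero_shot = [d for d in all_directions if pivot not in d]
    return supervised, zero_shot


supervised, zero_shot = split_directions(["en", "de", "fr", "nl"])
```

Under this assumption, a direction such as German to French is zero-shot: the model has seen both languages, but never paired with each other.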
### Solutions
To improve zero-shot translation, the authors propose two methods:
1. **Low-Rank Language-specific Embedding (LOLE)**:
- LOLE is applied on the encoder side. It introduces a learnable, language-specific embedding vector that biases the representation towards the subspace of the target language. This improves the transfer ability of the representation and, in turn, zero-shot translation quality.
2. **Language-specific Contrastive Learning of Representations (LCLR)**:
- LCLR is applied on the decoder side. It uses contrastive learning to separate the representation spaces of different languages. This reduces representation entanglement and further improves translation performance.
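The two ideas above can be sketched in plain Python. This is an illustration under stated assumptions, not the paper's formulation: the summary does not give the exact equations, so the low-rank bias is realized here as a rank-`r` language vector passed through a shared up-projection, and the contrastive objective is a generic supervised-contrastive loss in which representations of the same language are positives. All names, shapes, and the temperature value are hypothetical.

```python
import math


def add_low_rank_language_bias(hidden, lang_vec, up_proj):
    """Add a language-specific bias of rank len(lang_vec) to a hidden state.

    hidden:   list of d floats (one encoder state)
    lang_vec: list of r floats, learnable per target language (assumed)
    up_proj:  r x d matrix shared across languages (assumed)
    """
    d = len(hidden)
    bias = [sum(lang_vec[k] * up_proj[k][j] for k in range(len(lang_vec)))
            for j in range(d)]
    return [h + b for h, b in zip(hidden, bias)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def language_contrastive_loss(reps, langs, temperature=0.1):
    """Contrastive loss that pulls same-language representations together.

    reps:  list of vectors (e.g. pooled decoder states, an assumption)
    langs: language label of each vector
    Anchors without a same-language partner are skipped.
    """
    losses = []
    for i, anchor in enumerate(reps):
        others = [j for j in range(len(reps)) if j != i]
        positives = [j for j in others if langs[j] == langs[i]]
        if not positives:
            continue
        logits = {j: cosine(anchor, reps[j]) / temperature for j in others}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        # Average negative log-probability of picking a positive.
        losses.append(-sum(logits[j] - log_denom for j in positives)
                      / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Usage: a rank-2 bias on a 4-dimensional state.
biased = add_low_rank_language_bias(
    [1.0, 1.0, 1.0, 1.0], [1.0, 2.0],
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])

# Usage: language-separated representations give a lower loss than
# representations where the two languages overlap.
labels = ["de", "de", "fr", "fr"]
separated = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
entangled = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
loss_separated = language_contrastive_loss(separated, labels)
loss_entangled = language_contrastive_loss(entangled, labels)
```

In a real model both pieces would be trained jointly with the translation loss in a deep learning framework; the plain-Python version is only meant to show the shape of the computation.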
### Experimental results
The authors ran experiments on three benchmark datasets: Europarl-15, TED-19, and OPUS-100. The results show that LOLE and LCLR substantially improve zero-shot translation performance by improving the model's language transfer capacity.
### Conclusion
By introducing identity pairs and systematically analyzing representation transfer, the authors show that the encoder in multilingual translation transfers the source-language representation into the subspace of the target language. Zero-shot translation performs poorly because representations remain entangled with other languages. The proposed LOLE and LCLR methods alleviate this problem and thereby improve zero-shot translation performance.