Syntax-Aware Complex-Valued Neural Machine Translation

Yang Liu, Yuexian Hou
2024-06-17
Abstract: Syntax has been proven to be remarkably effective in neural machine translation (NMT). Previous models obtained syntax information from syntactic parsing tools and integrated it into NMT models to improve translation performance. In this work, we propose a method to incorporate syntax information into a complex-valued Encoder-Decoder architecture. The proposed model jointly learns word-level and syntax-level attention scores from the source side to the target side using an attention mechanism. Importantly, it is not dependent on specific network architectures and can be directly integrated into any existing sequence-to-sequence (Seq2Seq) framework. The experimental results demonstrate that the proposed method can bring significant improvements in BLEU scores on two datasets. In particular, the proposed method achieves a greater improvement in BLEU scores in translation tasks involving language pairs with significant syntactic differences.
Computation and Language
What problem does this paper attempt to address?
The paper addresses how to use syntactic information from both the source and target languages to improve translation quality in neural machine translation (NMT). It proposes Syntax-Aware Complex-Valued Neural Machine Translation (SynCoNMT), which represents words together with their syntactic information using complex-valued neural networks (CVNNs) and introduces a complex-valued attention mechanism into the encoder-decoder architecture. Because the attention takes the syntactic structure of both languages into account, the model can better match syntactic dependencies across languages. The method does not depend on a specific network architecture and can be applied to existing sequence-to-sequence (Seq2Seq) frameworks. Experiments on two datasets show that SynCoNMT achieves significant BLEU improvements over baseline models and other syntax-enhanced NMT methods, with the largest gains on language pairs whose syntax differs substantially. SynCoNMT also performs better when translating long sentences.
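To make the idea of a complex-valued attention over words and syntax more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: placing word features in the real part and syntax features in the imaginary part, scoring with the real part of the Hermitian inner product, and the names `complex_embed` and `complex_attention` are all assumptions made for this illustration.

```python
import numpy as np


def complex_embed(word_vecs, syntax_vecs):
    # Assumption: word features form the real part, syntax features the imaginary part.
    return word_vecs + 1j * syntax_vecs


def softmax(x, axis=-1):
    # Numerically stable softmax over real-valued scores.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def complex_attention(queries, keys, values):
    # Assumption: score = Re(q . conj(k)) / sqrt(d).
    # Re(q . conj(k)) = Re(q).Re(k) + Im(q).Im(k), so each score is the sum of a
    # word-level similarity and a syntax-level similarity.
    d = queries.shape[-1]
    scores = np.real(queries @ keys.conj().T) / np.sqrt(d)
    weights = softmax(scores, axis=-1)   # real-valued, each row sums to 1
    context = weights @ values           # complex-valued context vectors
    return context, weights


# Toy data: random vectors stand in for learned word and syntax embeddings.
rng = np.random.default_rng(0)
src_len, tgt_len, dim = 5, 3, 8
src = complex_embed(rng.normal(size=(src_len, dim)), rng.normal(size=(src_len, dim)))
tgt = complex_embed(rng.normal(size=(tgt_len, dim)), rng.normal(size=(tgt_len, dim)))

# Target-side states attend over source-side states.
context, weights = complex_attention(tgt, src, src)
```

Under these assumptions, a single complex inner product yields a score that combines word-level and syntax-level agreement, which is one simple way to read the "joint" attention described above; the paper's actual parameterization may differ.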