Anticipation-free Training for Simultaneous Translation

Chih-Chiang Chang, Shun-Po Chuang, Hung-yi Lee
2022-01-01
Abstract: Simultaneous translation (SimulMT) speeds up the translation process by starting to translate before the source sentence is completely available. It is difficult due to limited context and word order differences between languages. Existing methods increase latency or introduce adaptive read-write policies for SimulMT models to handle local reordering and improve translation quality. However, long-distance reordering can cause SimulMT models to learn translation incorrectly. Specifically, the model may be forced to predict target tokens when the corresponding source tokens have not been read. This leads to aggressive anticipation during inference, resulting in the hallucination phenomenon. To mitigate this problem, we propose a new framework that decomposes the translation process into a monotonic translation step and a reordering step, and we model the latter with an auxiliary sorting network (ASN). The ASN rearranges the hidden states to match the order in the target language, so that the SimulMT model can learn to translate more reasonably. The entire model is optimized end-to-end and does not rely on external aligners or data. During inference, the ASN is removed to achieve streaming. Experiments show that the proposed framework can outperform previous methods with less latency.
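As a rough illustration of the reordering step described in the abstract, the sketch below rearranges a list of source-order hidden states into a target-language order during training and leaves them untouched at inference. It assumes the target order is already known as a hard permutation; the abstract does not say how the ASN computes the ordering, and the names `reorder_hidden_states`, `hidden_states`, and `order` are hypothetical, not taken from the paper.

```python
from typing import List

Vector = List[float]


def reorder_hidden_states(hidden: List[Vector], target_order: List[int]) -> List[Vector]:
    """Rearrange hidden states so that position i holds hidden[target_order[i]].

    Assumption: target_order is a hard permutation of source positions.
    The real ASN learns the ordering; this helper only shows the effect.
    """
    if sorted(target_order) != list(range(len(hidden))):
        raise ValueError("target_order must be a permutation of source positions")
    return [hidden[j] for j in target_order]


# Toy example: one 2-dimensional hidden state per source token.
hidden_states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Hypothetical target-language order: the last source token is translated first.
order = [2, 0, 1]

# Training: the loss is computed against states arranged in target order.
training_view = reorder_hidden_states(hidden_states, order)

# Inference: the sorting step is dropped, so states are consumed in source order.
inference_view = hidden_states
```

The point of the toy is only that the decoder-side model sees monotonic input at inference, while the reordering burden during training is carried by a separate component.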
Computer Science