Paragraph-Level Hierarchical Neural Machine Translation

Yuqi Zhang, Kui Meng, Gongshen Liu
DOI: https://doi.org/10.1007/978-3-030-36718-3_28
2019-01-01
Abstract: Neural Machine Translation (NMT) has advanced considerably in recent years, but two challenges remain for long-text translation: building a high-quality corpus and finding optimal model parameters. In this paper, we first attempt to build a paragraph-parallel corpus from the English and Chinese versions of novels, and then design a hierarchical model for it to address both challenges. Our encoder and decoder take all the sentences of a paragraph as input and process words, sentences, and paragraphs at different levels, using a two-layer transformer. The bottom transformer of the encoder and decoder serves as another level of abstraction, conditioning on its own previous hidden states. Experimental results show that our hierarchical model significantly outperforms seven competitive baselines, including ensembles.