Document-level paraphrase generation base on attention enhanced graph LSTM

Dong Qiu, Lei Chen, Yang Yu
DOI: https://doi.org/10.1007/s10489-022-04031-z
IF: 5.3
2022-08-19
Applied Intelligence
Abstract: Paraphrase generation is a long-standing and important task in natural language processing. Existing work has focused mainly on sentence-level paraphrase generation, which ignores the relationships between sentences and the operations that depend on them, such as sentence reordering, sentence splitting, and sentence merging. In this paper, we consider the relationships between sentences as well as those within sentences. For the task of document-level paraphrase generation, we focus on reordering the sentences of a document to enhance inter-sentence diversity. We use an attention-enhanced graph long short-term memory (LSTM) network to encode the relationship graph between sentences, so that each sentence receives a coherent representation that conforms to its context. Using a sentence-level paraphrase generation model, we construct a pseudo-document-level paraphrase dataset. Automatic evaluation shows that our model achieves higher semantic relevance and diversity scores than other strong baseline models, and manual evaluation also confirms its validity. Experiments show that our model retains the semantics of the source document while generating paraphrase documents with high diversity. When the sentences are reordered, the output paraphrase documents still preserve the coherence between sentences, with higher scores.
computer science, artificial intelligence