A Latent Variable Model with Hierarchical Structure and GPT-2 for Long Text Generation

Kun Zhao, Hongwei Ding, Kai Ye, Xiaohui Cui, Zhongwang Fu
DOI: https://doi.org/10.1007/978-3-030-86383-8_24
2021-01-01
Abstract: Variational AutoEncoders (VAEs) have achieved strong results in text generation. However, current research focuses mainly on short texts, with little attention paid to long texts (more than 20 words). In this paper, we propose a latent variable model based on GPT-2 and a hierarchical structure for generating long text. A hierarchical GRU encodes the long text into latent variables. To generate better text, we combine the hierarchical structure and GPT-2 in the decoder for the first time. Our model improves perplexity (PPL), Kullback-Leibler (KL) divergence, Bilingual Evaluation Understudy (BLEU) score, and Self-BLEU. Experiments indicate that sentences generated by our model are more coherent and more diverse than those of the baseline model.
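For readers unfamiliar with the KL divergence term reported above, the following is a minimal sketch of how it is commonly computed in a VAE. It assumes a diagonal Gaussian posterior and a standard normal prior, which is the usual VAE setup but is not stated in the abstract; the function name `gaussian_kl` and its parameters are hypothetical and do not come from the paper.

```python
import math

def gaussian_kl(mu, logvar):
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ), in nats.

    Assumption: diagonal Gaussian posterior and standard normal prior,
    the common VAE choice; the abstract does not specify the paper's form.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0
        for m, lv in zip(mu, logvar)
    )

# A posterior identical to the prior carries no information about the input.
print(gaussian_kl([0.0, 0.0], [0.0, 0.0]))
# Shifting one mean away from zero makes the KL term positive.
print(gaussian_kl([1.0, 0.0], [0.0, 0.0]))
```

A KL term that stays near zero during training usually signals that the decoder is ignoring the latent variables, which is why text VAEs commonly report it alongside PPL.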