Long Text Generation with Topic-aware Discrete Latent Variable Model.

Erguang Yang, Mingtong Liu, Deyi Xiong, Yujie Zhang, Yufeng Chen, Jinan Xu
DOI: https://doi.org/10.18653/v1/2022.emnlp-main.554
2022-01-01
Abstract: Generating coherent long texts is an important yet challenging task, particularly for the open-ended generation task. Prior work based on discrete latent codes focuses on the modeling of discourse relation, resulting in discrete codes only learning shallow semantics (Ji and Huang, 2021). A natural text always revolves around several related topics and the transition across them is natural and smooth. In this work, we investigate whether discrete latent codes can learn information of topics. To this end, we build a topic-aware latent code-guided text generation model. To encourage discrete codes to model information about topics, we propose a span-level bag-of-words training objective for the model. Automatic and manual evaluation experiments show that our method can generate more topic-relevant and coherent texts.
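The abstract does not spell out the span-level bag-of-words objective, so the following is only a minimal sketch of what a generic bag-of-words loss over one text span could look like. It assumes that each discrete latent code yields a vector of vocabulary logits and that the loss is the average negative log-likelihood of the span's tokens, ignoring their order. The function names `softmax` and `span_bow_loss`, the toy vocabulary size, and the averaging are illustrative assumptions, not the authors' formulation.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into a probability distribution over the vocabulary."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def span_bow_loss(code_logits: List[float], span_token_ids: List[int]) -> float:
    """Hypothetical bag-of-words loss for a single span.

    code_logits: vocabulary logits predicted from the span's latent code
                 (assumed interface, not taken from the paper).
    span_token_ids: ids of the tokens that occur in the span; order is ignored.
    Returns the mean negative log-probability of the span's tokens.
    """
    if not span_token_ids:
        raise ValueError("span must contain at least one token")
    probs = softmax(code_logits)
    return -sum(math.log(probs[t]) for t in span_token_ids) / len(span_token_ids)


# Toy vocabulary of four word ids; the span contains words 0 and 1.
uniform_loss = span_bow_loss([0.0, 0.0, 0.0, 0.0], [0, 1])
peaked_loss = span_bow_loss([3.0, 3.0, 0.0, 0.0], [0, 1])
```

In this sketch, a code whose predicted distribution concentrates on the words that actually appear in the span receives a lower loss than a code that predicts a uniform distribution, which is the sense in which such an objective could push codes toward topic information.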