SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control

Xiaochuang Han,Sachin Kumar,Yulia Tsvetkov
DOI: https://doi.org/10.48550/arXiv.2210.17432
2022-10-31
Computation and Language
Abstract:Despite the growing success of diffusion models in continuous-valued domains (e.g., images), diffusion-based language models on discrete text have yet to match autoregressive language models on text generation benchmarks. In this work, we present SSD-LM -- a diffusion language model with two key design choices. First, SSD-LM is semi-autoregressive, iteratively generating blocks of text, allowing for flexible output length at decoding time while enabling local bidirectional context updates. Second, it is simplex-based, performing diffusion on the natural vocabulary space rather than a learned latent space, allowing us to incorporate classifier guidance and modular control without any adaptation of off-the-shelf classifiers. We evaluate SSD-LM on unconstrained as well as controlled text generation benchmarks, and show that it matches or outperforms strong autoregressive GPT-2 baselines across standard quality and diversity metrics.
What problem does this paper attempt to address?