Abstract: Diffusion Language Models (DLMs) are a promising avenue for text generation due to their practical properties on tractable controllable generation. They also have the advantage of not having to predict text autoregressively. However, despite these notable features, DLMs have not yet reached the performance levels of their autoregressive counterparts. One of the ways to reduce the performance gap between these two types of language models is to speed up the generation of DLMs. Therefore, we propose a novel methodology to address this issue in this work. It enables the execution of more generation steps within a given time frame, leading to higher-quality outputs. Specifically, our methods estimate DLMs' completeness of text generation and allow adaptive halting of the generation process. We evaluate our methods on Plaid, SSD, and CDCD DLMs and create a cohesive perspective on their generation workflows. Finally, we confirm that our methods allow halting these models and decrease the generation time by 10-40% without a drop in the quality of model samples.

What problem does this paper attempt to address?

The main problem this paper addresses is how to increase the generation speed of Diffusion Language Models (DLMs) while maintaining the quality of the generated text. Specifically, the authors propose a method that dynamically terminates the generation process of DLMs, so that more generation steps can be performed within a given time and output quality improves.

### Problem Background

Diffusion Language Models (DLMs) have several advantages in text generation, such as non-autoregressive prediction and controllable generation. However, their performance has not yet reached the level of autoregressive language models. One of the key problems is that DLM generation is slow.

### Solutions

To solve this problem, the authors propose the following methods:

1. **Estimating the Completion Degree of DLM Generation**: By evaluating how complete the text is during the DLM generation process, generation can be terminated early at an appropriate time.
2. **Adaptive Halting**: Three adaptive halting criteria are introduced, based on entropy, patience, and KL-divergence respectively.
3. **Fixed Step Criterion**: As a control experiment, generation is terminated after a fixed number of steps.

### Experimental Results

The authors show experimentally that these methods can significantly reduce generation time without reducing the quality of the generated text. Specifically:

- For the DDLM model, the KL-divergence criterion can terminate generation about 600 steps in advance, 50 steps earlier than the other criteria.
- The SSD model benefits less, but can still save about 10 steps.
- Adaptive termination works poorly for the Plaid model, but fixed-step termination can still improve its computational efficiency.

### Main Contributions

1. **Applying Adaptive Early Exiting in DLMs for the First Time**: The paper applies adaptive early exiting to DLMs for the first time and demonstrates its effectiveness.
2. **Providing Multiple Adaptive Termination Criteria**: Entropy, patience, and KL-divergence criteria are proposed and compared in detailed experiments.
3. **Experimentally Proving the Effectiveness of Early Exiting**: Multiple evaluation metrics (such as AR-NLL and diversity) confirm that early exiting does not affect the quality of the generated text.

### Conclusion

This research not only improves the generation speed of DLMs, but also provides tools and methods for further optimizing DLM design. In particular, dynamically evaluating the generation process gives a better understanding of a model's capabilities and limitations, which supports wider practical use of DLMs.
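The three adaptive halting criteria named in the summary (entropy, patience, KL-divergence) can be illustrated with a minimal sketch. The summary does not give the paper's exact definitions, so everything below is an assumption made for illustration: each generation step is represented as a list of per-position token distributions, the entropy criterion fires when the mean entropy falls below a threshold, the KL criterion fires when the mean KL-divergence between consecutive steps falls below a threshold, and the patience criterion fires when the argmax tokens stay unchanged for a given number of consecutive steps. The names `halting_step`, `toy_steps`, `threshold`, and `patience` are hypothetical and are not taken from the paper or from any library.

```python
import math
from typing import List, Sequence

Distribution = Sequence[float]        # probabilities over the vocabulary
Step = Sequence[Distribution]         # one distribution per sequence position


def entropy(p: Distribution) -> float:
    """Shannon entropy (in nats) of one token distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def kl_divergence(p: Distribution, q: Distribution, eps: float = 1e-12) -> float:
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(x * math.log(x / max(y, eps)) for x, y in zip(p, q) if x > 0.0)


def argmax_tokens(step: Step) -> List[int]:
    """Most likely token index at every position of a step."""
    return [max(range(len(p)), key=p.__getitem__) for p in step]


def halting_step(steps: Sequence[Step], criterion: str,
                 threshold: float = 0.05, patience: int = 2) -> int:
    """Index of the first step at which the chosen criterion fires.

    Falls back to the last step if the criterion never fires.
    Assumed (illustrative) definitions, not the paper's:
      "entropy"  : mean per-position entropy < threshold
      "kl"       : mean KL(current || previous) < threshold
      "patience" : argmax tokens unchanged for `patience` consecutive steps
    """
    prev = None
    stable = 0
    for t, step in enumerate(steps):
        if criterion == "entropy":
            if sum(entropy(p) for p in step) / len(step) < threshold:
                return t
        elif criterion == "kl" and prev is not None:
            mean_kl = sum(kl_divergence(p, q) for p, q in zip(step, prev)) / len(step)
            if mean_kl < threshold:
                return t
        elif criterion == "patience" and prev is not None:
            unchanged = argmax_tokens(step) == argmax_tokens(prev)
            stable = stable + 1 if unchanged else 0
            if stable >= patience:
                return t
        prev = step
    return len(steps) - 1


# Toy trajectory: one position, vocabulary of three tokens, sharpening over time.
toy_steps = [
    [[0.30, 0.40, 0.30]],
    [[0.60, 0.20, 0.20]],
    [[0.90, 0.05, 0.05]],
    [[0.98, 0.01, 0.01]],
    [[0.99, 0.005, 0.005]],
    [[0.99, 0.005, 0.005]],
]

if __name__ == "__main__":
    for name in ("entropy", "kl", "patience"):
        print(name, halting_step(toy_steps, name, threshold=0.1))
```

In a real sampler the check would run inside the denoising loop rather than over a stored trajectory, and the thresholds would need tuning per model; the summary's results suggest that how much each criterion helps differs considerably between DDLM, SSD, and Plaid.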

Diffusion Language Models Generation Can Be Halted Early
