Abstract: We propose a Distributional Approach for addressing Controlled Text Generation from pre-trained Language Models (LMs). This approach permits to specify, in a single formal framework, both "pointwise" and "distributional" constraints over the target LM -- to our knowledge, the first model with such generality -- while minimizing KL divergence from the initial LM distribution. The optimal target distribution is then uniquely determined as an explicit EBM (Energy-Based Model) representation. From that optimal representation we then train a target controlled Autoregressive LM through an adaptive distributional variant of Policy Gradient. We conduct a first set of experiments over pointwise constraints showing the advantages of our approach over a set of baselines, in terms of obtaining a controlled LM balancing constraint satisfaction with divergence from the initial LM. We then perform experiments over distributional constraints, a unique feature of our approach, demonstrating its potential as a remedy to the problem of Bias in Language Models. Through an ablation study, we show the effectiveness of our adaptive technique for obtaining faster convergence. (Code available at https://github.com/naver/gdc)
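
The constrained problem described in the abstract can be written out as a short worked formulation. This is a sketch under assumed notation, not a quotation from the paper: a is the initial LM, q ranges over candidate distributions, the \phi_i are feature functions with target moments \bar{\mu}_i, and the \lambda_i are the multipliers that make the moments match. A pointwise constraint is the special case of a binary feature with target moment 1, for which the solution reduces to p(x) \propto a(x)\,\phi(x).

```latex
% Sketch only: the symbols a, q, p, P, Z, \phi_i, \bar{\mu}_i, \lambda_i are assumed notation.
\begin{align*}
p &= \arg\min_{q} \; D_{\mathrm{KL}}(q \,\|\, a)
  \quad \text{s.t.} \quad \mathbb{E}_{x \sim q}\big[\phi_i(x)\big] = \bar{\mu}_i,
  \quad i = 1, \dots, k, \\
P(x) &= a(x)\, \exp\!\Big(\sum_{i=1}^{k} \lambda_i\, \phi_i(x)\Big), \qquad
p(x) = \frac{P(x)}{Z}, \qquad Z = \sum_{x} P(x).
\end{align*}
```

Here P is the unnormalized EBM and p its normalized version. The autoregressive LM mentioned in the abstract is then trained to approximate p, since p itself cannot be sampled from directly.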

What problem does this paper attempt to address?

This paper addresses how to control pre-trained language models (LMs) so that their output meets specific requirements, for example avoiding toxic content, reducing certain demographic biases, or steering generation towards a particular topic or style. Existing optimization methods often cause "degradation" when pursuing these goals: the generated text earns a higher average reward but loses coherence and fluency. This degradation is usually attributed to the optimized model drifting too far from the original pre-trained model.

To address this, the paper proposes Generation with Distributional Control (GDC), which can satisfy pointwise constraints (requirements on each individual output) and distributional constraints (requirements on the collective statistics of all generated text) at the same time, while deviating as little as possible from the original pre-trained model. To the authors' knowledge, this is the first approach to handle both types of constraints within a single framework.

The main contributions of the paper are:

1. A distributional view of controlled text generation, formalized as a constraint satisfaction problem combined with a divergence-minimization objective, which applies to both distributional and pointwise constraints.
2. A demonstration that these constraints lead to an optimal Energy-Based Model (EBM) representation of the target distribution, together with the KL-Adaptive DPG algorithm, which approximates this optimal EBM distribution with an autoregressive policy.
3. A series of experiments with different pointwise and distributional constraints, starting from GPT-2, in which the method compares favorably with strong baselines on measures such as diversity and fluency.

The distributional constraint experiments show the method's potential for mitigating bias in pre-trained language models. Overall, the paper offers a way to improve control over pre-trained language models, in particular for reducing social biases.
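
The idea of meeting a distributional constraint while staying close to the original model can be illustrated on a toy example. The following is a minimal sketch, not the authors' implementation: it assumes a four-outcome "language model" `a`, a single binary feature `phi`, and a target moment of 0.5, and it finds the exponential-tilting coefficient by bisection. All names (`tilted`, `moment`, `kl`, `solve_lambda`) are hypothetical helpers introduced for illustration.

```python
import math


def tilted(a, phi, lam):
    """Return p(x) proportional to a(x) * exp(lam * phi(x)), normalized."""
    weights = [ax * math.exp(lam * fx) for ax, fx in zip(a, phi)]
    z = sum(weights)  # partition function Z
    return [w / z for w in weights]


def moment(p, phi):
    """Expected value of the feature phi under the distribution p."""
    return sum(px * fx for px, fx in zip(p, phi))


def kl(p, q):
    """KL divergence D_KL(p || q) for discrete distributions."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)


def solve_lambda(a, phi, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection on lam; the moment is non-decreasing in lam."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if moment(tilted(a, phi, mid), phi) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Toy base distribution (assumed): the feature fires on the last two outcomes,
# so its moment under `a` is 0.2 + 0.1 = 0.3.
a = [0.4, 0.3, 0.2, 0.1]
phi = [0, 0, 1, 1]
target = 0.5  # distributional constraint: the feature should hold half the time

lam = solve_lambda(a, phi, target)
p = tilted(a, phi, lam)

print("lambda:", lam)
print("p:", p)
print("moment under p:", moment(p, phi))
print("KL(p || a):", kl(p, a))
```

In this sketch the tilted distribution meets the target moment while keeping the relative probabilities of outcomes that share the same feature value unchanged, which is how it stays close to `a`. A real LM has far too many sequences to enumerate, which is why the paper instead trains an autoregressive policy to approximate the EBM.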

A Distributional Approach to Controlled Text Generation

Generating More Interesting Responses in Neural Conversation Models with Distributional Constraints

Diffusion-LM Improves Controllable Text Generation

Guaranteed Generation from Large Language Models

An Invariant Learning Characterization of Controlled Text Generation

disco: a toolkit for Distributional Control of Generative Models

A Distributional Lens for Multi-Aspect Controllable Text Generation

Scaling Diffusion Language Models via Adaptation from Autoregressive Models

Diffusion Guided Language Modeling

Controllable Text Generation with Language Constraints

Teaching Others is Teaching Yourself Regularization For Controllable Language Models

Toward Controlled Generation of Text

Controllable Text Generation for Large Language Models: A Survey

Quantized Embedding Vectors for Controllable Diffusion Language Models

Controlled Training Data Generation with Diffusion Models

Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models

Controlled Text Generation via Language Model Arithmetic

Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs

Continuous Language Model Interpolation for Dynamic and Controllable Text Generation

Locate&Edit: Energy-based Text Editing for Efficient, Flexible, and Faithful Controlled Text Generation