Controllable Text Generation in the Instruction-Tuning Era

Dhananjay Ashok, Barnabas Poczos
2024-05-03
Abstract: While most research on controllable text generation has focused on steering base Language Models, the emerging instruction-tuning and prompting paradigm offers an alternate approach to controllability. We compile and release ConGenBench, a testbed of 17 different controllable generation tasks, using a subset of it to benchmark the performance of 9 different baselines and methods on Instruction-tuned Language Models. To our surprise, we find that prompting-based approaches outperform controllable text generation methods on most datasets and tasks, highlighting a need for research on controllable text generation specifically with Instruction-tuned Language Models. Prompt-based approaches match human performance on most stylistic tasks while lagging on structural tasks, foregrounding a need to study more varied constraints and more challenging stylistic tasks. To facilitate such research, we provide an algorithm that uses only a task dataset and a Large Language Model with in-context capabilities to automatically generate a constraint dataset. This method eliminates the field's dependence on pre-curated constraint datasets, hence vastly expanding the range of constraints that can be studied in the future.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper examines how well different methods control the output of large language models (LLMs) in the instruction-tuning era. Specifically, it sets out to:
1. **Investigate whether common controllable text generation problems still pose challenges for instruction-tuned LLMs**: The paper considers tasks such as toxicity avoidance, sentiment control, and topic control in the instruction-tuning setting.
2. **Evaluate whether methods that enhance the controllability of base LLMs also apply to instruction-tuned LLMs**: The paper benchmarks 9 baselines and controllable text generation methods on instruction-tuned models.
3. **Compare controllable text generation methods with prompt-based methods**: The paper finds that, on most datasets and tasks, prompt-based methods outperform controllable text generation methods. Prompt-based methods match human performance on most stylistic tasks but lag on structural tasks.
To support this work, the paper proposes an algorithm that automatically generates a constraint dataset using only a task dataset and an LLM with in-context learning capabilities, removing the reliance on pre-curated constraint datasets and expanding the range of constraints that can be studied in the future. The paper also compiles and releases ConGenBench, a testbed of 17 controllable generation tasks; a subset of it is used to benchmark the methods on instruction-tuned LLMs.
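The text above does not spell out the steps of the constraint-dataset algorithm, so the following is only a minimal sketch of one plausible reading: an LLM with in-context capabilities is prompted with a few labeled examples and asked whether each text from a task dataset satisfies a natural-language constraint. The prompt format, the function names (`build_prompt`, `generate_constraint_dataset`), and the stand-in `toy_llm` are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, List, Tuple


def build_prompt(constraint: str, examples: List[Tuple[str, str]], text: str) -> str:
    """Assemble a few-shot labeling prompt (the format is an assumption)."""
    parts = [f"Does the text satisfy the constraint: {constraint}? Answer yes or no."]
    for ex_text, ex_label in examples:
        parts.append(f"Text: {ex_text}\nAnswer: {ex_label}")
    parts.append(f"Text: {text}\nAnswer:")
    return "\n\n".join(parts)


def generate_constraint_dataset(
    task_texts: List[str],
    constraint: str,
    examples: List[Tuple[str, str]],
    llm: Callable[[str], str],
) -> List[Tuple[str, int]]:
    """Label each task text with 1 (satisfies) or 0 (does not satisfy).

    `llm` is any callable mapping a prompt string to a completion string.
    Texts whose completion is neither "yes" nor "no" are skipped.
    """
    dataset = []
    for text in task_texts:
        answer = llm(build_prompt(constraint, examples, text)).strip().lower()
        if answer.startswith("yes"):
            dataset.append((text, 1))
        elif answer.startswith("no"):
            dataset.append((text, 0))
    return dataset


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model: keyword check on the final text."""
    last_text = prompt.rsplit("Text: ", 1)[1]
    return "yes" if "great" in last_text.lower() else "no"


few_shot = [("I loved it.", "yes"), ("Terrible experience.", "no")]
texts = ["The food was great.", "The service was slow."]
dataset = generate_constraint_dataset(texts, "positive sentiment", few_shot, toy_llm)
```

In practice `toy_llm` would be replaced by a call to an actual instruction-tuned model, and the resulting labeled pairs could serve as the constraint dataset that controllable generation methods typically require.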