Focused Prefix Tuning for Controllable Text Generation

Congda Ma, Tianyu Zhao, Makoto Shing, Kei Sawada, Manabu Okumura
2023-06-10
Abstract: In a controllable text generation dataset, there exist unannotated attributes that could provide irrelevant learning signals to models that use it for training and thus degrade their performance. We propose focused prefix tuning (FPT) to mitigate the problem and to enable the control to focus on the desired attribute. Experimental results show that FPT can achieve better control accuracy and text fluency than baseline models in single-attribute control tasks. In multi-attribute control tasks, FPT achieves comparable control accuracy with the state-of-the-art approach while keeping the flexibility to control new attributes without retraining existing models.
Computation and Language
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper aims to address the negative impact of unannotated attributes (implicit attributes) in the dataset on model training in controllable text generation tasks. Specifically, these implicit attributes may cause the model to perform poorly when controlling explicit attributes. For example, in the IMDb dataset, most of the training data has the "theme" as "technology" and the "sentiment" as "negative." This bias can lead the model to generate content with "negative sentiment" even when the target attribute is "positive sentiment."

### Solution

To mitigate this issue, the authors propose the **Focused Prefix Tuning (FPT)** method. FPT encodes explicit and implicit attributes using specific and general prefixes, respectively, and suppresses the influence of implicit attributes during inference through logits operations, thereby improving the control accuracy of explicit attributes and the fluency of the text.

### Main Contributions

1. **Single Attribute Control Task**: Experimental results show that FPT achieves higher control accuracy and text fluency in single attribute control tasks.
2. **Multi-Attribute Control Task**: In multi-attribute control tasks, FPT achieves performance comparable to existing state-of-the-art methods while maintaining the flexibility to incrementally add new attributes without retraining all prefixes.
3. **Attribute Transfer Phenomenon**: FPT effectively reduces the attribute transfer phenomenon by suppressing the influence of implicit attributes, making the generated text more consistent with the target explicit attributes.

### Experimental Setup

- **Datasets**: AGNews (topic control), IMDb (sentiment control), Toxic Comment Dataset (toxicity control).
- **Baseline Models**: GPT-2, DExperts, GeDi, Vanilla Prefix Tuning, Contrastive Prefix Tuning, etc.
- **Evaluation Metrics**: Relevance, perplexity, bias, etc.
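
Perplexity, listed among the evaluation metrics, is the usual proxy for text fluency: lower values mean the evaluating language model finds the text more predictable. The sketch below uses the standard definition (the exponential of the mean negative log-probability per token). The function name and the toy input are illustrative assumptions; the summary does not say which language model or tokenization the authors used to score fluency.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean log-probability) over the tokens of one text.

    token_log_probs: natural-log probabilities that an evaluating
    language model assigned to each token (hypothetical input).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_log_prob = sum(token_log_probs) / len(token_log_probs)
    return math.exp(-mean_log_prob)


# Toy check: if every token has probability 0.25, the model is as
# uncertain as a uniform choice among four options.
uniform_text = [math.log(0.25)] * 8
print(perplexity(uniform_text))
```
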
### Experimental Results

- **Single Attribute Control Task**: FPT outperforms baseline models in terms of control accuracy and text fluency.
- **Multi-Attribute Control Task**: FPT achieves control accuracy comparable to state-of-the-art methods and offers better flexibility, since new attributes can be added without retraining existing prefixes.

### Limitations

1. **Hyperparameter Selection**: The hyperparameter α must be chosen to balance control ability against text fluency.
2. **Time Cost**: The time cost of FPT is slightly higher than that of other prefix tuning methods but lower than that of GeDi.

### Conclusion

By suppressing the influence of implicit attributes, FPT alleviates the attribute transfer phenomenon and improves performance on controllable text generation tasks. It performs well in both single and multi-attribute control tasks and offers the flexibility to incrementally add new attributes.
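
The logits operation described under Solution can be illustrated with a small sketch. The summary does not give the exact formula, so the combination below (scale the logits obtained with the specific prefix by α and subtract the logits obtained with the general prefix scaled by α − 1) is an assumption chosen because it reduces to plain specific-prefix decoding at α = 1. The function names and the toy three-token vocabulary are hypothetical.

```python
import math


def focused_logits(specific_logits, general_logits, alpha=1.5):
    """Combine per-token logits from two prefixes (assumed form).

    specific_logits: logits from the model run with the specific prefix
                     (explicit attribute plus whatever else it absorbed).
    general_logits:  logits from the model run with the general prefix
                     (implicit attributes only).
    alpha:           strength of the suppression; alpha = 1 returns the
                     specific logits unchanged.
    """
    if len(specific_logits) != len(general_logits):
        raise ValueError("logit vectors must have the same length")
    return [
        alpha * s - (alpha - 1.0) * g
        for s, g in zip(specific_logits, general_logits)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary of three tokens. Token 0 is favored by both prefixes
# (likely driven by an implicit attribute); token 1 only by the specific one.
specific = [2.0, 1.0, 0.0]
general = [2.0, 0.0, 0.0]
combined = focused_logits(specific, general, alpha=2.0)
print(combined)           # [2.0, 2.0, 0.0]
print(softmax(combined))  # next-token distribution after suppression
```

In this toy case, token 0 loses its advantage over token 1 because the general prefix also favors it. Larger α suppresses more aggressively, which matches the trade-off between control ability and fluency noted under Limitations.
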