On Conditional and Compositional Language Model Differentiable Prompting

Jonathan Pilault, Can Liu, Mohit Bansal, Markus Dreyer
2023-07-04
Abstract: Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks. Prompts can be represented by a human-engineered word sequence or by a learned continuous embedding. In this work, we investigate conditional and compositional differentiable prompting. We propose a new model, Prompt Production System (PRopS), which learns to transform task instructions or input metadata into continuous prompts that elicit task-specific outputs from the PLM. Our model uses a modular network structure based on our neural formulation of Production Systems, which allows the model to learn discrete rules -- neural functions that learn to specialize in transforming particular prompt input patterns, making it suitable for compositional transfer learning and few-shot learning. We present extensive empirical and theoretical analysis and show that PRopS consistently surpasses other PLM adaptation techniques, and often improves upon fully fine-tuned models, on compositional generalization tasks, controllable summarization and multilingual translation, while needing fewer trainable parameters.
Computation and Language, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to adapt pre-trained language models (PLMs) effectively through conditional and compositional differentiable prompting so that they perform better on downstream tasks. Specifically, the authors propose a new model, Prompt Production System (PRopS), which converts task instructions or input metadata into continuous prompt vectors that guide a frozen PLM to generate task-specific outputs.

### Main problems

1. **Improving PLM adaptability**: Existing PLM adaptation methods, such as full-model fine-tuning and human-engineered prompts, have limitations. Full-model fine-tuning updates a large number of parameters and needs a large amount of training data, while human-engineered prompts are highly sensitive to their exact wording and perform poorly in some cases. PRopS aims to overcome these limitations through conditional and compositional differentiable prompts.
2. **Compositional generalization ability**: PLMs perform poorly on compositional generalization tasks, especially when facing unseen combinations of instructions. PRopS improves performance on these tasks through a modular network structure and a conditional selection mechanism.
3. **Multi-task learning and few-shot learning**: In multi-task and few-shot scenarios, the adaptability of PLMs is particularly important. PRopS improves performance in both settings by sharing knowledge modules between tasks.

### Solutions

1. **Conditional prompt generation**: PRopS uses a Condition Encoder to encode task instructions or input metadata into fixed-size vectors. These vectors serve as conditional inputs from which the model generates continuous prompt vectors specific to the task and input.
2. **Modular network structure**: PRopS builds on the modular network structure of Neural Production Systems, which allows the model to learn discrete rules that specialize in transforming particular prompt input patterns. This modular structure makes the model well suited to compositional transfer learning and few-shot learning.
3. **Conditional selection mechanism**: PRopS dynamically selects the applicable rules through a conditional selection mechanism, and so generates prompt vectors suited to each task and input condition.

### Experimental results

- **Compositional generalization tasks**: Results on the SCAN dataset show that PRopS significantly outperforms other adaptation methods, including full-model fine-tuning, on compositional generalization tasks.
- **Multi-task learning**: Experiments on the Europarl multilingual translation dataset show that PRopS performs well in few-shot multi-task learning, especially on cross-lingual translation tasks.
- **Metadata conditioning**: On the news summarization task, PRopS significantly improves performance by generating prompts conditioned on the metadata of news articles (such as news source and type).
- **Compositional tasks**: On the abstract generalization task, which requires combining the knowledge of support tasks, PRopS demonstrates advantages in zero-shot and few-shot learning.

### Conclusion

Through conditional and compositional differentiable prompts, PRopS significantly improves the adaptability and performance of pre-trained language models across a variety of tasks, especially in compositional generalization, multi-task learning, and few-shot learning scenarios. The method reduces the number of trainable parameters and also improves the controllability and generalization ability of the model.
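
To make the three solution components described above more concrete (a condition encoding, a set of rule modules, and a conditional selection step), the following is a minimal, illustrative Python sketch. It is not the authors' implementation. Every name in it (`make_rules`, `select_rules`, `generate_prompt`), the dot-product scoring, the top-k selection, the `tanh` rule functions, and the simple recurrence used to emit several prompt vectors are assumptions chosen for illustration only. The `condition` vector stands in for the output of a condition encoder, which is not modeled here.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_rules(num_rules, dim, seed=0):
    """Hypothetical rule modules: each has a key (for selection) and a weight matrix."""
    rng = random.Random(seed)
    rules = []
    for _ in range(num_rules):
        key = [rng.gauss(0, 1) for _ in range(dim)]
        weight = [[rng.gauss(0, 1) / math.sqrt(dim) for _ in range(dim)]
                  for _ in range(dim)]
        rules.append({"key": key, "weight": weight})
    return rules


def select_rules(condition, rules, top_k):
    """Assumed selection step: score rules by dot product, keep the top_k."""
    scores = [dot(condition, rule["key"]) for rule in rules]
    ranked = sorted(range(len(rules)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([scores[i] for i in chosen])
    return chosen, weights


def generate_prompt(condition, rules, top_k=2, prompt_len=4):
    """Produce prompt_len continuous prompt vectors from one condition vector."""
    chosen, weights = select_rules(condition, rules, top_k)
    state = condition
    prompt = []
    for _ in range(prompt_len):
        mixed = [0.0] * len(condition)
        for idx, w in zip(chosen, weights):
            out = matvec(rules[idx]["weight"], state)
            mixed = [m + w * math.tanh(o) for m, o in zip(mixed, out)]
        prompt.append(mixed)
        state = mixed  # assumed recurrence: feed the last vector back in
    return chosen, prompt


# Stand-in for an encoded task instruction (a real encoder would produce this).
condition = [0.1 * i for i in range(8)]
rules = make_rules(num_rules=4, dim=8)
chosen, prompt = generate_prompt(condition, rules)
```

In a full system, the resulting `prompt` vectors would be prepended to the frozen PLM's input embeddings, and only the encoder and rule parameters would be trained. The sketch uses fixed random weights purely to show how selection and composition fit together.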