Abstract: It has been demonstrated that the art of prompt tuning is highly effective in efficiently extracting knowledge from pretrained foundation models, encompassing pretrained language models (PLMs), vision pretrained models, and vision-language (V-L) models. However, the efficacy of employing fixed soft prompts with a predetermined position for concatenation with inputs for all instances, irrespective of their inherent disparities, remains uncertain. Variables such as the position, length, and representations of prompts across diverse instances and tasks can substantially influence the performance of prompt tuning. In this context, we provide a theoretical analysis, which reveals that optimizing the position of the prompt to encompass the input can capture additional semantic information that traditional prefix or postfix prompt tuning methods fail to capture. Building upon our analysis, we present a unified dynamic prompt (DP) tuning strategy that dynamically determines different factors of prompts based on specific tasks and instances. To accomplish this, we employ a lightweight learning network with Gumbel-Softmax, allowing us to learn instance-dependent guidance. Experimental results underscore the significant performance improvement achieved by dynamic prompt tuning across a wide range of tasks, including NLP tasks, vision recognition tasks, and vision-language tasks. Furthermore, we establish the universal applicability of our approach under full-data, few-shot, and multitask scenarios. Code is available at https://github.com/Xianjun-Yang/DPT.

What problem does this paper attempt to address?

The paper attempts to address the limitations of fixed-position, fixed-length, and fixed-representation soft prompts in prompt tuning across different tasks and instances. Specifically, existing methods typically use fixed soft prompt positions and lengths, which may lead to suboptimal utilization of the pre-trained model's knowledge, especially when handling different tasks and instances. Therefore, the paper proposes a Dynamic Prompting framework aimed at improving model performance by dynamically adjusting the position, length, and representation of prompts.

### Main Issues

1. **Limitations of Fixed Soft Prompt Position**:
   - Existing prompt tuning methods usually fix the soft prompt at the beginning or end of the input sequence, which may fail to capture all semantic information in the input sequence.
2. **Limitations of Fixed Soft Prompt Length**:
   - The length of the prompt significantly impacts model performance, but existing methods typically use a fixed prompt length, which may not be the optimal choice.
3. **Limitations of Fixed Soft Prompt Representation**:
   - Different tasks and instances may require different prompt representations, but existing methods usually use a fixed prompt representation, which may lead to performance degradation.

### Solution

The paper proposes a unified Dynamic Prompting (DP) framework that dynamically adjusts prompts through the following strategies:

1. **Dynamic Position**:
   - Dynamically determine the insertion position of the soft prompt in the input sequence to better capture the semantic information of the input sequence.
   - Use a lightweight learning network and the Gumbel-Softmax technique to learn the optimal insertion position for each task or instance.
2. **Dynamic Length**:
   - Dynamically adjust the length of the soft prompt to meet the needs of different tasks and instances.
   - Use a lightweight learning network and the Gumbel-Softmax technique to learn the optimal prompt length for each task or instance.
3. **Dynamic Representation**:
   - Dynamically generate soft prompts from a prompt pool to more efficiently utilize the frozen language model.
   - Use a small network to learn the attention scores for each input instance, thereby generating new soft prompts.

### Experimental Results

The paper validates the effectiveness of dynamic prompting through experiments on multiple tasks:

- **NLP Tasks**: Including tasks like BoolQ, MultiRC, WiC, CB, and RTE, where dynamic prompting significantly improved model performance.
- **Visual Recognition Tasks**: Using the Vision Prompt Tuning (VPT) framework, dynamic prompting achieved better results on the FGVC dataset.
- **Multimodal Tasks**: Using the MaPLe framework, dynamic prompting also performed well in multimodal prompt learning tasks.

### Conclusion

The Dynamic Prompting framework significantly improves model performance across various tasks by dynamically adjusting the position, length, and representation of prompts, especially for large-scale pre-trained models. This method is not only suitable for single-task settings but also for multi-task and few-shot learning scenarios, showing broad application prospects.
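
As a rough illustration of the Dynamic Position strategy described above, the following sketch samples a split point with a Gumbel-Softmax draw and wraps the input with the two resulting parts of the soft prompt. It is a minimal pure-Python sketch, not the authors' implementation: the function names, the placeholder logits (standing in for the output of the lightweight network), and the use of string tokens in place of embedding vectors are all assumptions made for illustration, and the gradient handling needed for training is omitted.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return a relaxed one-hot sample (a probability vector) over the logits."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel_noise) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def choose_split(logits, tau=1.0, rng=random):
    """Pick the index with the largest sampled probability (hard selection)."""
    probs = gumbel_softmax(logits, tau, rng)
    return max(range(len(probs)), key=probs.__getitem__)


def split_prompt_around_input(prompt, tokens, split):
    """Place prompt[:split] before the input and prompt[split:] after it."""
    return prompt[:split] + tokens + prompt[split:]


# Hypothetical example: 4 soft-prompt slots, 3 input tokens.
prompt = ["p1", "p2", "p3", "p4"]
tokens = ["x1", "x2", "x3"]

# One logit per possible split point 0..len(prompt); in the paper's setting
# these would come from the lightweight network (assumed uniform here).
position_logits = [0.0] * (len(prompt) + 1)

split = choose_split(position_logits, tau=0.5, rng=random.Random(0))
sequence = split_prompt_around_input(prompt, tokens, split)
print(split, sequence)
```

In this sketch, a split of `len(prompt)` reproduces ordinary prefix prompt tuning and a split of `0` reproduces postfix prompt tuning, so the learned position covers both fixed layouts as special cases.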

Dynamic Prompting: A Unified Framework for Prompt Tuning
