Abstract: Consistently scaling pre-trained language models (PLMs) imposes substantial burdens on model adaptation, necessitating more efficient alternatives to conventional fine-tuning. Given the advantage of prompting in the zero-shot setting and the observed performance fluctuation among different prompts, we explore instance-level prompts and their generalizability. By searching through the prompt space, we first validate the assumption that for every instance, there is almost always a lottery prompt that induces the correct prediction from the PLM, and such a prompt can be obtained at a low cost thanks to the inherent ability of PLMs. Meanwhile, we find that some strong lottery prompts have high performance over the whole training set, and they are equipped with distinguishable linguistic features. Lastly, we attempt to generalize the searched strong lottery prompts to unseen data with a prompt ensembling method without any parameter tuning. Experiments are conducted on various types of NLP classification tasks and demonstrate that the proposed method can achieve results comparable to other gradient-free and optimization-free baselines.

What problem does this paper attempt to address?

The main question this paper addresses is: without updating the parameters of a pre-trained language model (PLM), can at least one instance-level prompt (a "lottery prompt") that induces the correct output be found for each data point? Specifically, the researchers explored the following hypothesis:

- For a given pre-trained language model and classification dataset, there exists at least one lottery prompt for each instance that induces the correct prediction without updating the PLM parameters.

To verify this hypothesis, the researchers searched the prompt space for a lottery prompt for each data point. They selected 13 representative classification datasets and designed a reasonable prompt search space. The experimental results show that such lottery prompts can be found in almost all cases, indicating that within a limited textual prompt space, a combination of common words can almost always be found that serves as a prompt yielding the correct prediction.

In addition, the researchers analyzed the characteristics of these lottery prompts and their generalization to unseen data. They discovered some "strong prompts" that perform well on the entire training set and that generalize effectively to test data, without any parameter tuning, through a prompt ensembling method based on mutual information, thereby achieving performance comparable to or better than many competitive baseline methods.

### Key Conclusions

1. **Existence of lottery prompts**: For most datasets, a lottery prompt can be found for almost every data point, enabling the pre-trained language model to correctly predict the label.
2. **Low search cost**: For most datasets, the actual cost of finding a lottery prompt is far lower than the theoretical maximum, usually not exceeding 30 API calls.
3. **Impact of model capacity**: As the scale of the pre-trained model and the number of pre-training steps increase, the success rate of finding lottery prompts rises and the search cost falls.
4. **Discovery of strong prompts**: Some prompts perform excellently on the entire training set, have interpretable linguistic features, and can be generalized effectively to unseen data through the ensembling method.

This research demonstrates the great potential of pre-trained language models and suggests new directions for more efficient prompt search and ensembling methods.
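
The search and generalization steps summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the prompt space (`VERBS`, `ADVERBS`), the stand-in model `toy_predict`, and all function names are hypothetical, and the `budget=30` default merely echoes the "30 API calls" figure quoted in the summary. The ensembling step uses plain majority voting as a simple substitute; the summary describes a mutual-information-based method, which is not reproduced here.

```python
import zlib
from collections import Counter
from itertools import product
from typing import Callable, List, Optional, Tuple

# Hypothetical prompt space: each (verb, adverb) pair yields one template.
VERBS = ["was", "felt", "seemed"]
ADVERBS = ["overall", "truly", "honestly"]
PROMPTS = [f"It {v} {a} [MASK]." for v, a in product(VERBS, ADVERBS)]
LABELS = ["negative", "positive"]

# A predictor maps (input text, prompt) to a label with frozen parameters.
Predictor = Callable[[str, str], str]


def toy_predict(text: str, prompt: str) -> str:
    """Stand-in for a frozen PLM (assumption): a deterministic pseudo-random label."""
    return LABELS[zlib.crc32(f"{text}|{prompt}".encode("utf-8")) % 2]


def find_lottery_prompt(
    text: str,
    label: str,
    prompts: List[str],
    predict: Predictor,
    budget: int = 30,
) -> Optional[Tuple[str, int]]:
    """Return the first prompt giving the correct label and the calls used, or None."""
    for calls, prompt in enumerate(prompts[:budget], start=1):
        if predict(text, prompt) == label:
            return prompt, calls
    return None


def rank_prompts(
    data: List[Tuple[str, str]], prompts: List[str], predict: Predictor
) -> List[Tuple[float, str]]:
    """Score every prompt by training-set accuracy, best first ("strong prompts")."""
    scored = []
    for prompt in prompts:
        hits = sum(predict(text, prompt) == label for text, label in data)
        scored.append((hits / len(data), prompt))
    return sorted(scored, reverse=True)


def ensemble_predict(text: str, prompts: List[str], predict: Predictor) -> str:
    """Majority vote over several prompts (a simplification, see the lead-in)."""
    votes = Counter(predict(text, p) for p in prompts)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [("a good film", "positive"), ("a dull film", "negative")]
    for text, label in train:
        print(text, "->", find_lottery_prompt(text, label, PROMPTS, toy_predict))
    top = [p for _, p in rank_prompts(train, PROMPTS, toy_predict)[:3]]
    print(ensemble_predict("a new film", top, toy_predict))
```

Passing the predictor in as a function keeps the search logic independent of any particular model; in practice `toy_predict` would be replaced by a call to a real PLM, with each call counting toward the search budget.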

Exploring Lottery Prompts for Pre-trained Language Models

RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning

Effectively Prompting Small-sized Language Models for Cross-lingual Tasks via Winning Tickets

PPT: Pre-trained Prompt Tuning for Few-shot Learning

Instance-aware Prompt Learning for Language Understanding and Generation

AdaPrompt: Adaptive Model Training for Prompt-based NLP

Reliable Gradient-free and Likelihood-free Prompt Tuning

Instance-wise Prompt Tuning for Pretrained Language Models

What Makes Pre-trained Language Models Better Zero/Few-shot Learners?

Concentrate Attention: Towards Domain-Generalizable Prompt Optimization for Language Models

Helping Language Models Learn More: Multi-dimensional Task Prompt for Few-shot Tuning

Demystifying Prompts in Language Models via Perplexity Estimation

Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning

Learning How to Ask: Querying LMs with Mixtures of Soft Prompts

Prompting Large Language Model for Machine Translation: A Case Study

BayesPrompt: Prompting Large-Scale Pre-Trained Language Models on Few-shot Inference via Debiased Domain Abstraction

PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine

Dialogue for Prompting: a Policy-Gradient-Based Discrete Prompt Generation for Few-shot Learning

APrompt: Attention Prompt Tuning for Efficient Adaptation of Pre-trained Language Models

The language of prompting: What linguistic properties make a prompt successful?