Abstract: Large language models (LLMs) have achieved impressive success in text-formatted learning problems, and most popular LLMs are deployed in a black-box fashion. Fine-tuning is usually necessary to obtain better performance on a specific downstream task, and this functionality is provided by the owners of the black-box LLMs. To fine-tune a black-box LLM, labeled data are always required to adjust the model parameters. However, in many real-world applications, LLMs can label textual datasets with even better quality than skilled human annotators, which motivates us to explore the possibility of fine-tuning black-box LLMs with unlabeled data. In this paper, we propose unsupervised prompt learning for classification with black-box LLMs, where the learnable parameters are the prompt itself and the pseudo labels of the unlabeled data. Specifically, the prompt is modeled as a sequence of discrete tokens, and every token has its own to-be-learned categorical distribution. For learning the pseudo labels, we are the first to consider the in-context learning (ICL) capabilities of LLMs: we first identify reliable pseudo-labeled data using the LLM, and then assign pseudo labels to the other unlabeled data based on the prompt, allowing the pseudo-labeled data to serve as in-context demonstrations alongside the prompt. These in-context demonstrations matter: previously, they were involved when the prompt was used for prediction but not when the prompt was trained; taking them into account during training therefore makes the prompt-learning and prompt-using stages more consistent. Experiments on benchmark datasets show the effectiveness of our proposed algorithm. After unsupervised prompt learning, the pseudo-labeled dataset can be used for further fine-tuning by the owners of the black-box LLMs.
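As a rough illustration of the two ingredients named in the abstract, the sketch below samples a discrete prompt from per-position categorical distributions and assembles a query in which reliable pseudo-labeled examples act as in-context demonstrations. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the toy vocabulary, the `Input:`/`Label:` template, and the agreement-based reliability rule (`min_agreement`) are all hypothetical, and the step that actually learns the distributions from black-box feedback is omitted.

```python
import random
from collections import Counter


def sample_prompt(token_probs, vocab, rng):
    """Draw one discrete prompt: each position samples a token
    from its own categorical distribution over the vocabulary."""
    return [rng.choices(vocab, weights=probs, k=1)[0] for probs in token_probs]


def select_reliable(votes, min_agreement=0.8):
    """Keep examples whose repeated LLM answers mostly agree.
    Assumption: agreement ratio stands in for 'reliability'."""
    reliable = {}
    for text, labels in votes.items():
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            reliable[text] = label
    return reliable


def build_query(prompt_tokens, demos, text):
    """Concatenate the prompt, the pseudo-labeled demonstrations,
    and the new input into a single query string for the LLM."""
    lines = [" ".join(prompt_tokens)]
    lines += [f"Input: {t}\nLabel: {y}" for t, y in demos.items()]
    lines.append(f"Input: {text}\nLabel:")
    return "\n".join(lines)


# Toy setup: three prompt positions over a four-token vocabulary.
# The distributions are one-hot here, so sampling is deterministic.
vocab = ["classify", "sentiment", "review", "the"]
token_probs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
]
prompt = sample_prompt(token_probs, vocab, random.Random(0))

# Hypothetical repeated LLM answers for two unlabeled texts.
votes = {
    "great film": ["pos"] * 5,
    "it was fine": ["pos", "neg", "pos", "neg", "neg"],
}
demos = select_reliable(votes)
query = build_query(prompt, demos, "dull plot")
print(query)
```

In this toy run only the unanimously labeled example survives the reliability filter and is reused as a demonstration; the ambiguous one would receive its pseudo label later, from a query of this form.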

Multi-Mask Label Mapping for Prompt-Based Learning

Label prompt for multi-label text classification

Dual Enhancement for Multi-Label Learning with Missing Labels

Prompt-Based Label-Aware Framework for Few-Shot Multi-Label Text Classification

Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks

MmAP: Multi-modal Alignment Prompt for Cross-domain Multi-task Learning

Automatic Multi-Label Prompting: Simple and Interpretable Few-Shot Classification

Partial Multi-label Learning with Label and Feature Collaboration

On Unsupervised Prompt Learning for Classification with Black-box Language Models

Determined Multi-Label Learning via Similarity-Based Prompt

Rethinking Visual Prompt Learning as Masked Visual Token Modeling

Token-Level Contrastive Learning with Modality-Aware Prompting for Multimodal Intent Recognition

MaPLe: Multi-modal Prompt Learning

Rethinking Prompting Strategies for Multi-Label Recognition with Partial Annotations

Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels

Hierarchical Prompt Learning Using CLIP for Multi-label Classification with Single Positive Labels

TransPrompt V2: Transferable Prompt-based Fine-tuning for Few-shot Text Classification

Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning

Prompt-Based Meta-Learning for Few-shot Text Classification

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models

Multi-modal Attribute Prompting for Vision-Language Models