Zero-to-Strong Generalization: Eliciting Strong Capabilities of Large Language Models Iteratively without Gold Labels

Chaoqun Liu, Qin Chao, Wenxuan Zhang, Xiaobao Wu, Boyang Li, Anh Tuan Luu, Lidong Bing
2024-09-19
Abstract: Large Language Models (LLMs) have demonstrated remarkable performance through supervised fine-tuning or in-context learning using gold labels. However, this paradigm is limited by the availability of gold labels, while in certain scenarios, LLMs may need to perform tasks that are too complex for humans to provide such labels. To tackle this challenge, this study explores whether solely utilizing unlabeled data can elicit strong model capabilities. We propose a new paradigm termed zero-to-strong generalization. We iteratively prompt LLMs to annotate unlabeled data and retain high-quality labels by filtering. Surprisingly, we observe that this iterative process gradually unlocks LLMs' potential on downstream tasks. Our experiments on extensive classification and reasoning tasks confirm the effectiveness of our proposed framework. Our analysis indicates that this paradigm is effective for both in-context learning and fine-tuning, and for various model sizes.
Computation and Language, Machine Learning
### What problem does this paper attempt to solve?

This paper explores how to elicit the capabilities of large language models (LLMs) on complex tasks without gold labels. Specifically, it proposes a new paradigm called "zero-to-strong generalization," which iteratively prompts the model to annotate unlabeled data and filters the resulting labels, gradually unlocking the model's capabilities.

#### Main Issues:

1. **Limitations of existing paradigms**: Supervised fine-tuning and in-context learning both depend on gold labels, which may be difficult or impossible to obtain in some scenarios.
2. **Limitations of weak supervision**: Weak-to-strong generalization uses a weak supervisor model to guide a strong model, but it remains limited by the weak supervisor's capabilities, and in some cases no weak supervisor is available.

#### Solution:

- **Propose a zero-to-strong generalization framework**: This framework requires neither gold labels nor a weak supervisor model. Instead, it starts from random or invalid demonstration examples and then iteratively selects high-confidence samples as new demonstrations to gradually improve performance.
- **Experimental validation**: The authors conducted extensive experiments on classification tasks, extreme-label classification tasks, and reasoning tasks to demonstrate the framework's effectiveness. The method applies to fine-tuning as well as in-context learning, and it remains effective for larger models.

#### Core Contributions:

- Proposed a simple and effective zero-to-strong generalization framework.
- Demonstrated the effectiveness of this framework across various tasks.
- Analyzed why zero-to-strong generalization works and found that its benefits are most pronounced for stronger models and more complex tasks.
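
The iterative loop described under the solution can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `zero_to_strong` function, the `Annotator` interface, the global top-k selection rule, and the keyword-based `toy_annotate` stand-in are all hypothetical. In practice the annotator would be an LLM prompted with the current demonstrations, with confidence derived from something like its output probabilities.

```python
import random
from typing import Callable, List, Sequence, Tuple

Demo = Tuple[str, str]  # (input text, label)
# Hypothetical annotator interface: (demonstrations, text) -> (label, confidence)
Annotator = Callable[[Sequence[Demo], str], Tuple[str, float]]


def zero_to_strong(
    unlabeled: Sequence[str],
    label_space: Sequence[str],
    annotate: Annotator,
    rounds: int = 3,
    top_k: int = 2,
    seed: int = 0,
) -> List[Demo]:
    """Iteratively rebuild the demonstration set from high-confidence pseudo-labels."""
    rng = random.Random(seed)
    k = min(top_k, len(unlabeled))

    # Round 0: no gold labels, so start from demonstrations with random labels.
    demos: List[Demo] = [
        (text, rng.choice(list(label_space)))
        for text in rng.sample(list(unlabeled), k)
    ]

    for _ in range(rounds):
        scored = []
        for text in unlabeled:
            label, confidence = annotate(demos, text)
            scored.append((confidence, text, label))
        # Keep the k most confident predictions as the next round's demonstrations.
        # (Assumption: a global top-k filter; other filtering rules are possible.)
        scored.sort(key=lambda item: item[0], reverse=True)
        demos = [(text, label) for _, text, label in scored[:k]]

    return demos


def toy_annotate(demos: Sequence[Demo], text: str) -> Tuple[str, float]:
    """Stand-in for an LLM call. It ignores `demos`; a real model would condition on them."""
    pos, neg = text.count("good"), text.count("bad")
    label = "positive" if pos >= neg else "negative"
    confidence = abs(pos - neg) / (pos + neg + 1)  # made-up confidence score
    return label, confidence


texts = ["good good movie", "bad bad plot", "good acting", "bad", "fine film", "good but bad"]
final_demos = zero_to_strong(texts, ["positive", "negative"], toy_annotate)
print(final_demos)
```

Because the toy annotator ignores its demonstrations, every round yields the same selection here. With a real model, each round's demonstrations change the next round's predictions, which is where the iterative improvement would come from.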