LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning

Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Arnav M. Das, Gantavya Bhatt, Yinglun Zhu, Jeffrey Bilmes, Simon Shaolei Du, Kevin Jamieson, Robert D Nowak
2024-03-02
Abstract: Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive. To mitigate this cost, machine learning methods, such as transfer learning, semi-supervised learning and active learning, aim to be label-efficient: achieving high predictive performance from relatively few labeled examples. While obtaining the best label-efficiency in practice often requires combinations of these techniques, existing benchmark and evaluation frameworks do not capture a concerted combination of all such techniques. This paper addresses this deficiency by introducing LabelBench, a new computationally-efficient framework for joint evaluation of multiple label-efficient learning techniques. As an application of LabelBench, we introduce a novel benchmark of state-of-the-art active learning methods in combination with semi-supervised learning for fine-tuning pretrained vision transformers. Our benchmark demonstrates better label-efficiencies than previously reported in active learning. LabelBench's modular codebase is open-sourced for the broader community to contribute label-efficient learning methods and benchmarks. The repository can be found at: https://github.com/EfficientTraining/LabelBench
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper mainly addresses the following issues:

### Research Background and Objectives

- **Label Cost Issue**: In modern machine learning applications, labeled data is crucial but expensive to obtain.
- **Improving Label Efficiency**: The paper studies how to achieve high predictive performance with fewer labeled samples.

### Solution

- **LabelBench Framework**: Proposes a comprehensive and computationally efficient framework for evaluating the combined effects of various label-efficient learning techniques.
- **Combining Multiple Techniques**: Integrates multiple label-efficient learning methods, such as transfer learning, semi-supervised learning (Semi-SL), and active learning (AL), into a unified evaluation framework.
- **For Large Pre-trained Models**: Focuses particularly on applying these techniques to large-scale pre-trained models to achieve better label efficiency.

### Main Contributions

1. **LabelBench Framework**: A new framework for jointly evaluating multiple label-efficient learning techniques, designed to handle the computational challenges of large-scale neural network architectures.
2. **Lightweight Retraining Scheme**: Proposes a lightweight retraining scheme that updates only the last layer of a large pre-trained model, significantly reducing training costs while retaining most of the label-efficiency gains brought by active learning.
3. **Comprehensive Experimental Results**: Demonstrates experimentally how various deep active learning algorithms perform when combined with semi-supervised learning for fine-tuning large pre-trained vision transformers. The results show that this approach can significantly reduce annotation costs compared to traditional methods, especially on datasets like CIFAR-10 and ImageNet.

### Experimental Highlights

- On the CIFAR-10 dataset, using active learning methods can save up to 75% of annotation costs compared to random sampling.
- Under a fixed annotation budget, active learning algorithms can improve test accuracy by over 1.2% and increase prediction accuracy on the unlabeled training data pool by over 5%.
- Compared to previous best results, the new method improves test accuracy by at least 10% under the same settings.

### Conclusion

LabelBench provides a lightweight benchmarking framework that allows researchers to test their algorithms in more realistic and larger-scale scenarios.
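
To make the idea of last-layer retraining inside an active learning loop more concrete, here is a minimal sketch in Python. It is not LabelBench's code or API. Everything in it is an assumption made for illustration: the synthetic clustered vectors stand in for embeddings from a frozen pre-trained encoder, the softmax linear head trained by gradient descent stands in for the "last layer", and margin sampling is used as one generic query strategy. All function names (`make_synthetic_embeddings`, `train_linear_head`, `margin_query`, `active_learning_loop`) are hypothetical.

```python
import numpy as np


def make_synthetic_embeddings(n=600, dim=16, num_classes=3, seed=0):
    """Stand-in for frozen encoder outputs: unit-norm clustered vectors (assumption)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=2.0, size=(num_classes, dim))
    labels = rng.integers(num_classes, size=n)
    feats = centers[labels] + rng.normal(size=(n, dim))
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
    return feats, labels


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_linear_head(feats, labels, num_classes, epochs=300, lr=1.0):
    """Fit only a softmax linear layer on fixed features (the encoder is never updated)."""
    n, dim = feats.shape
    W = np.zeros((dim, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        grad = (softmax(feats @ W + b) - onehot) / n  # cross-entropy gradient w.r.t. logits
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b


def margin_query(probs, candidates, batch_size):
    """Select the candidates whose top-two class probabilities are closest."""
    top2 = np.sort(probs[candidates], axis=1)[:, -2:]
    margins = top2[:, 1] - top2[:, 0]
    return candidates[np.argsort(margins)[:batch_size]]


def active_learning_loop(feats, oracle_labels, num_classes,
                         init_size=10, batch_size=10, rounds=4, seed=0):
    """Pool-based loop: retrain the head, query a batch, reveal its labels, repeat."""
    rng = np.random.default_rng(seed)
    n = feats.shape[0]
    labeled = rng.choice(n, size=init_size, replace=False)  # random initial batch
    for _ in range(rounds):
        W, b = train_linear_head(feats[labeled], oracle_labels[labeled], num_classes)
        probs = softmax(feats @ W + b)
        unlabeled = np.setdiff1d(np.arange(n), labeled)
        labeled = np.concatenate([labeled, margin_query(probs, unlabeled, batch_size)])
    # Final retraining on every label collected so far.
    W, b = train_linear_head(feats[labeled], oracle_labels[labeled], num_classes)
    return labeled, (W, b)


if __name__ == "__main__":
    feats, labels = make_synthetic_embeddings()
    labeled, (W, b) = active_learning_loop(feats, labels, num_classes=3)
    pool_acc = (softmax(feats @ W + b).argmax(axis=1) == labels).mean()
    print(f"labels used: {len(labeled)}, accuracy on the pool: {pool_acc:.3f}")
```

Because the features are computed once and only the linear head is refit each round, every retraining step is cheap. That is the intuition behind the lightweight scheme summarized above, although the framework's actual training procedure and query strategies may differ from this sketch.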