Abstract: We introduce supervised contrastive active learning (SCAL) and propose efficient query strategies in active learning based on feature similarity (featuresim) and principal component analysis based feature-reconstruction error (fre) to select informative data samples with diverse feature representations. We demonstrate that our proposed method achieves state-of-the-art accuracy and model calibration and reduces sampling bias in an active learning setup for balanced and imbalanced datasets on image classification tasks. We also evaluate the robustness of models to distributional shift derived from different query strategies in the active learning setting. Using extensive experiments, we show that our proposed approach outperforms high-performing, compute-intensive methods by a big margin, resulting in 9.9% lower mean corruption error, 7.2% lower expected calibration error under dataset shift, and 8.9% higher AUROC for out-of-distribution detection.

What problem does this paper attempt to address?

The paper addresses three main problems:

1. **Improving the robustness and calibration of the model**: Existing active learning research mainly focuses on improving model accuracy through sample acquisition, but accuracy alone does not reflect how a model performs in real-world environments. Deep neural networks face particular challenges under data distribution shift and on out-of-distribution data. The authors therefore aim to develop a method that lets the model maintain good performance under these conditions.
2. **Reducing sampling bias**: In practical applications, collected datasets often follow a long-tail distribution, that is, the number of samples per category is extremely unbalanced. Even for a balanced dataset, the active learning process itself introduces sampling bias. This bias affects the fairness, robustness and credibility of the model, so the authors propose a new active learning method to alleviate it.
3. **Improving computational efficiency**: Some existing sample selection methods, such as CoreSet and Bayesian Active Learning by Disagreement (BALD), have high computational costs, resulting in long query times. The authors aim to design a sample selection strategy that is both efficient and accurate.

To solve these problems, the authors propose Supervised Contrastive Active Learning (SCAL) and introduce two query strategies based on feature similarity ($S_{\text{featuresim}}$) and PCA-based feature-reconstruction error ($S_{\text{fre}}$). With these strategies, SCAL performs well at reducing sampling bias, improving model robustness and calibration, and keeping query computation efficient. Specifically, the contributions of SCAL include:

- Proposing two novel query strategies that build on the advantages of supervised contrastive learning.
- Evaluating the model in terms of calibration and robustness, demonstrating the method's advantage under data distribution shift and in out-of-distribution detection.
- Demonstrating the computational efficiency of the method in selecting diverse and informative samples, reducing sampling bias, and improving active learning performance on balanced and long-tail imbalanced datasets.

Through these improvements, SCAL achieves strong results on image classification tasks and provides a useful reference for future research.
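The two query strategies named above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the summary only states that $S_{\text{featuresim}}$ is based on feature similarity and $S_{\text{fre}}$ on PCA-based feature-reconstruction error. The sketch assumes features have already been extracted by an encoder, uses cosine similarity, fits one PCA subspace per labeled class, and aggregates with a max over labeled samples and a min over classes. All function names, the `n_components` parameter, and these aggregation choices are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def featuresim_scores(pool_feats, labeled_feats):
    """Highest cosine similarity of each pool sample to any labeled sample.

    Assumption: a low score means the sample is unlike anything labeled so far.
    """
    sims = l2_normalize(pool_feats) @ l2_normalize(labeled_feats).T
    return sims.max(axis=1)


def fre_scores(pool_feats, labeled_feats, labeled_y, n_components=2):
    """Smallest PCA reconstruction error over per-class subspaces.

    Assumption: a high score means no class subspace explains the sample well.
    """
    errors = []
    for c in np.unique(labeled_y):
        class_feats = labeled_feats[labeled_y == c]
        mean = class_feats.mean(axis=0)
        # Rows of vt are the principal directions of the centered class features.
        _, _, vt = np.linalg.svd(class_feats - mean, full_matrices=False)
        basis = vt[:n_components]
        centered = pool_feats - mean
        # Project onto the class subspace, then measure what is left over.
        recon = centered @ basis.T @ basis
        errors.append(np.linalg.norm(centered - recon, axis=1))
    return np.min(np.stack(errors, axis=1), axis=1)


def select_queries(scores, k, largest):
    """Indices of the k pool samples with the largest (or smallest) scores."""
    order = np.argsort(scores)
    return order[::-1][:k] if largest else order[:k]
```

Under these assumptions, a featuresim query would call `select_queries(featuresim_scores(pool, labeled), k, largest=False)` to pick the least similar samples, and an fre query would call `select_queries(fre_scores(pool, labeled, y), k, largest=True)` to pick the worst-reconstructed ones. Both reduce to a few matrix products over precomputed features, which is consistent with the efficiency argument in point 3.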

Robust Contrastive Active Learning with Feature-guided Query Strategies

Semisupervised SVM Batch Mode Active Learning with Applications to Image Retrieval

Adversarial Supervised Contrastive Learning

Deep Active Learning with Contrastive Learning Under Realistic Data Pool Assumptions

Semi-supervised SVM Batch Mode Active Learning for Image Retrieval

Efficient Adversarial Contrastive Learning via Robustness-Aware Coreset Selection

Effect of air ions on L 1210 cells: changes in fluorescence of membrane-bound 1,8-aniline-naphthalene-sulfonate (ANS) after in vitro exposure of cells to air ions.

Active learning with adaptive regularization

Understanding Contrastive Learning via Distributionally Robust Optimization

SCALP -- Supervised Contrastive Learning for Cardiopulmonary Disease Classification and Localization in Chest X-rays using Patient Metadata

Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning

Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data

CLAF: Contrastive Learning with Augmented Features for Imbalanced Semi-Supervised Learning

Nerve growth factor eye drop administrated on the ocular surface of rodents affects the nucleus basalis and septum: Biochemical and structural evidence

Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning

A Scalable Algorithm for Active Learning

Contrastive Learning With Stronger Augmentations

Clinical Contrastive Learning for Biomarker Detection

Self-Damaging Contrastive Learning

Supervised Contrastive Representation Learning: Landscape Analysis with Unconstrained Features

Adversarial Contrastive Learning by Permuting Cluster Assignments