Abstract: Uncertainty sampling is a prevalent active learning algorithm that sequentially queries annotations for the data samples about which the current prediction model is uncertain. However, the use of uncertainty sampling has been largely heuristic: (i) there is no consensus on the proper definition of "uncertainty" for a specific task under a specific loss; (ii) there is no theoretical guarantee that prescribes a standard protocol for implementing the algorithm, for example, how to handle sequentially arriving annotated data within optimization algorithms such as stochastic gradient descent. In this work, we systematically examine uncertainty sampling algorithms under both stream-based and pool-based active learning. We propose a notion of equivalent loss, which depends on the uncertainty measure used and the original loss function, and establish that an uncertainty sampling algorithm essentially optimizes against this equivalent loss. This perspective verifies the properness of existing uncertainty measures from two aspects: surrogate property and loss convexity. Furthermore, we propose a new notion for designing uncertainty measures called "loss as uncertainty". The idea is to use the conditional expected loss given the features as the uncertainty measure. Such an uncertainty measure has nice analytical properties and is general enough to cover both classification and regression problems, which enables us to provide the first generalization bound for uncertainty sampling algorithms under both stream-based and pool-based settings, in the full generality of the underlying model and problem. Lastly, we establish connections between certain variants of the uncertainty sampling algorithms and both risk-sensitive objectives and distributional robustness, which can partly explain the advantage of uncertainty sampling algorithms when the sample size is small.
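
To make the "loss as uncertainty" idea concrete, the following is a minimal sketch for pool-based binary classification. It is an illustration under stated assumptions, not the paper's estimator: the conditional expected loss is approximated by plugging in the model's own predictive distribution Bernoulli(p) for the label, and the names `sigmoid`, `expected_loss_uncertainty`, and `select_queries` are hypothetical. Under this plug-in assumption, the expected log loss reduces to the binary entropy and the expected 0-1 loss reduces to min(p, 1 - p), which recovers two familiar uncertainty measures.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def expected_loss_uncertainty(p, loss="log"):
    """Expected loss at each point if the label were drawn from Bernoulli(p).

    Assumption for illustration: the model's predicted probability p stands in
    for the true conditional label distribution given the features.
    """
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    if loss == "log":
        # E[-log p_Y] under Y ~ Bernoulli(p) is the binary entropy.
        return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    if loss == "zero_one":
        # Predicting the likelier class errs with probability min(p, 1 - p).
        return np.minimum(p, 1.0 - p)
    raise ValueError(f"unknown loss: {loss}")


def select_queries(X_pool, w, k, loss="log"):
    """Return indices of the k pool points with the largest uncertainty."""
    p = sigmoid(X_pool @ w)  # linear-logistic model, assumed for the sketch
    u = expected_loss_uncertainty(p, loss)
    return np.argsort(-u)[:k]
```

With a pool `X_pool` of shape `(n, d)` and a weight vector `w` of shape `(d,)`, `select_queries(X_pool, w, k)` returns the indices to send for annotation. A stream-based variant could instead query each arriving point with a probability that increases with its uncertainty value; that choice is left open here.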

Uncertainty Sampling Based Active Learning with Diversity Constraint by Sparse Selection

Uncertainty-Based Active Learning Via Sparse Modeling for Image Classification

Uncertainty-aware Complementary Label Queries for Active Learning

Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization

A Serial Sample Selection Framework for Active Learning

Breaking the Barrier: Selective Uncertainty-based Active Learning for Medical Image Segmentation

Active Learning Via Sequential Design and Uncertainty Sampling

Uncertainty for Active Learning on Graphs

Unsupervised Fusion Feature Matching for Data Bias in Uncertainty Active Learning

Exploring Representativeness and Informativeness for Active Learning

Entropy-based Active Learning for Object Detection with Progressive Diversity Constraint

Active learning with adaptive regularization

Deep Adversarial Active Learning with Model Uncertainty for Image Classification

Bridging Diversity and Uncertainty in Active learning with Self-Supervised Pre-Training

Active Learning With Sampling by Uncertainty and Density for Data Annotations

Active Learning with Sampling by Uncertainty and Density for Word Sense Disambiguation and Text Classification

Understanding Uncertainty Sampling

Unified Locally Linear Classifiers with Diversity-Promoting Anchor Points

Semisupervised SVM Batch Mode Active Learning with Applications to Image Retrieval

Sample Diversity Selection Strategy Based on Label Distribution Morphology for Active Label Distribution Learning

Towards Better Uncertainty Sampling: Active Learning With Multiple Views For Deep Convolutional Neural Network