Abstract: In practice, the labeled data at hand are often inadequate to train a reliable classifier and, more seriously, some of these data may be mislabeled due to various human factors. This paper therefore proposes a novel semi-supervised learning paradigm that can handle both label insufficiency and label inaccuracy. To address label insufficiency, we use a graph to bridge the data points so that label information can be propagated from the scarce labeled examples to the unlabeled examples along the graph edges. To address label inaccuracy, Graph Trend Filtering (GTF) and Smooth Eigenbase Pursuit (SEP) are adopted to filter out the initial noisy labels. GTF penalizes the l_0 norm of the label difference between connected examples in the graph and exhibits better local adaptivity than the traditional l_2 norm-based Laplacian smoother. SEP reconstructs the correct labels by emphasizing the leading eigenvectors of the Laplacian matrix, those associated with small eigenvalues, as these eigenvectors reflect real label smoothness and carry rich class separation cues. We call our algorithm "Semi-supervised learning under Inadequate and Incorrect Supervision" (SIIS). Thorough experimental results on image classification, text categorization, and speech recognition demonstrate that SIIS is effective in label error correction and outperforms state-of-the-art methods in the presence of label noise and label scarcity.
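
The idea of reconstructing labels from the Laplacian eigenvectors with small eigenvalues can be illustrated with a minimal sketch. This is not the SIIS algorithm: GTF is omitted, and SEP is replaced by a plain ridge-regularized least-squares fit of the noisy labels onto the smoothest eigenvectors. The function names, the Gaussian-kernel graph, the toy data, and the parameters `sigma`, `m`, and `ridge` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian-kernel adjacency matrix with a zero diagonal (assumed graph)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def smooth_eigenbasis(W, m):
    """Return the m Laplacian eigenvectors with the smallest eigenvalues."""
    L = np.diag(W.sum(axis=1)) - W   # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)      # eigh returns eigenvalues in ascending order
    return vecs[:, :m]


def denoise_and_propagate(W, Y_noisy, labeled_idx, m=2, ridge=1e-3):
    """Fit noisy one-hot labels on the labeled rows of the smooth basis,
    then read off class predictions for every node in the graph."""
    U = smooth_eigenbasis(W, m)
    U_l = U[labeled_idx]
    # Ridge-regularized least squares: a simplified stand-in for SEP.
    coeffs = np.linalg.solve(U_l.T @ U_l + ridge * np.eye(m),
                             U_l.T @ Y_noisy[labeled_idx])
    return (U @ coeffs).argmax(axis=1)


# Toy data: two well-separated clusters, three labels each, one label flipped.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(10.0, 0.5, size=(20, 2))])
y_true = np.repeat([0, 1], 20)
labeled_idx = np.array([0, 1, 2, 20, 21, 22])
y_noisy = y_true.copy()
y_noisy[2] = 1                       # simulate an annotation error
Y_noisy = np.eye(2)[y_noisy]         # one-hot label matrix

pred = denoise_and_propagate(gaussian_affinity(X), Y_noisy, labeled_idx)
print("accuracy:", (pred == y_true).mean())
```

In this toy setup the two smoothest eigenvectors are essentially constant on each cluster, so the fit amounts to a per-cluster vote among the labeled points: the flipped label is outvoted by its two correct neighbors, and every unlabeled point inherits its cluster's majority class. Real data would need a sparser graph and a more careful choice of `m`.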

Semi-Supervised Learning: the Case When Unlabeled Data is Equally Useful

Generalized entropy based semi-supervised learning

Dual-Classifier Collaborative Method Based on Semi-Supervised Active Learning

Optimal and Safe Estimation for High-Dimensional Semi-Supervised Learning

Safe semi-supervised learning: a brief introduction

Semi-supervised Distribution Learning

Budget Semi-supervised Learning

Adaptive Semi-Supervised Learning with Discriminative Least Squares Regression

Fairness in Semi-supervised Learning: Unlabeled Data Help to Reduce Discrimination

Disagreement-based Semi-supervised Learning

Semi-supervised learning with extremely sparse labeled data on multiple semi-supervised assumptions

When Semi-supervised Learning Meets Ensemble Learning

A Review of Semi Supervised Learning Theories and Recent Advances

Structure Regularized Self-Paced Learning for Robust Semi-Supervised Pattern Classification

Fairness Constraints in Semi-supervised Learning

Improving Semi-Supervised Support Vector Machines Through Unlabeled Instances Selection

Distributed Estimation on Semi-Supervised Generalized Linear Model

Robust Semi-Supervised Classification for Noisy Labels Based on Self-Paced Learning

Learning with Inadequate and Incorrect Supervision

Why the pseudo label based semi-supervised learning algorithm is effective?