Abstract: In many classification tasks, labeled samples are difficult to acquire, whereas unlabeled samples are easy to obtain. Active learning (AL) can be used to address this labeling problem. Among the many kinds of AL algorithms, those that focus on labeling the unlabeled samples within the margin band of an SVM are an effective way to reduce the manual labeling workload. AL requires human involvement, but the time and energy that human annotators can provide are often limited, which places a significant restriction on AL-based sample labeling. The motivation of this work is therefore to study the processing that follows the AL process. For an AL algorithm that focuses on exploring the unlabeled samples within the margin band of an SVM, we investigate whether, after it stops, such unlabeled samples can continue to be explored by semi-supervised learning (SSL). One of the challenges in designing such an SSL algorithm is how to determine the confidence of unlabeled samples and then select the ones with high confidence. In this work, we propose three criteria to determine confidence: 1) the smoothness assumption; 2) the explored positive samples and the explored negative samples should be as similar as possible to the labeled positive samples and the labeled negative samples, respectively; 3) the explored positive samples and the explored negative samples should be as different as possible from the labeled negative samples and the labeled positive samples, respectively. Based on these three criteria, we propose an SSL algorithm, SSL_3C. Furthermore, we apply SSL_3C to audio event classification and conduct experiments on two public datasets. Experimental results demonstrate that SSL_3C can effectively improve classification performance after the AL process. The selected unlabeled samples are not only of high confidence but also very informative.
Moreover, SSL_3C is not sensitive to the sizes of the labeled and unlabeled training sets. The contributions of this work are twofold: first, we propose an effective SSL algorithm to explore the unlabeled samples within the margin band of an SVM; second, we propose three criteria to determine the confidence of unlabeled samples. Based on these three criteria, the explored unlabeled samples are not only of high confidence but also very informative. Since the labeling problem exists in many classification fields and SSL_3C can effectively reduce the manual labeling workload, the proposed SSL_3C should find widespread applications in many other fields.
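
The abstract names the three confidence criteria but gives no formulas, so the following is only an illustrative sketch of how such criteria might be combined, not the SSL_3C algorithm itself. It assumes the usual SVM convention that the margin band is |f(x)| < 1, a k-nearest-neighbor agreement score as a stand-in for the smoothness assumption, a mean RBF similarity for criteria 2 and 3, and an equal weighting of the three terms. The function names (`rbf_similarity`, `select_confident`), the parameters `k`, `top_n`, and `gamma`, and the toy data are all hypothetical.

```python
import numpy as np

def rbf_similarity(a, b, gamma=1.0):
    """Mean RBF similarity between each row of `a` and all rows of `b`."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2).mean(axis=1)

def select_confident(decision, X_unl, X_pos, X_neg, k=3, top_n=2, gamma=1.0):
    """Score unlabeled samples inside the margin band and keep the top_n.

    `decision` holds SVM decision values f(x) for the rows of X_unl.
    The scoring rule is an assumption made for illustration only.
    """
    in_band = np.flatnonzero(np.abs(decision) < 1.0)   # margin band: |f(x)| < 1
    cand = X_unl[in_band]
    pseudo = np.where(decision[in_band] >= 0, 1, -1)   # pseudo-label from sign of f(x)

    # Criterion 1 (smoothness, approximated): fraction of the k nearest
    # labeled samples whose label agrees with the pseudo-label.
    X_lab = np.vstack([X_pos, X_neg])
    y_lab = np.concatenate([np.ones(len(X_pos)), -np.ones(len(X_neg))])
    d2 = ((cand[:, None, :] - X_lab[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    smooth = (y_lab[nearest] == pseudo[:, None]).mean(axis=1)

    sim_pos = rbf_similarity(cand, X_pos, gamma)
    sim_neg = rbf_similarity(cand, X_neg, gamma)
    # Criterion 2: similarity to labeled samples of the same (pseudo) class.
    same = np.where(pseudo == 1, sim_pos, sim_neg)
    # Criterion 3: similarity to labeled samples of the opposite class (penalized).
    other = np.where(pseudo == 1, sim_neg, sim_pos)

    confidence = smooth + same - other                 # equal weights: an assumption
    order = np.argsort(-confidence)[:top_n]
    return in_band[order], pseudo[order], confidence[order]

# Toy data: two labeled clusters and four unlabeled points.
X_pos = np.array([[2.0, 2.0], [2.5, 1.5], [1.5, 2.5]])
X_neg = -X_pos
X_unl = np.array([[1.5, 1.5], [-1.5, -1.5], [0.1, -0.1], [4.0, 4.0]])
decision = X_unl.sum(axis=1) / 4.0   # stand-in for a trained linear SVM's f(x)

idx, labels, conf = select_confident(decision, X_unl, X_pos, X_neg)
print(idx, labels, conf)
```

In this toy run the last point lies outside the margin band and is never considered, while the point near the origin stays in the band but scores low because it is far from both labeled clusters; the two points close to the labeled clusters are the ones selected.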

Semi-Supervised Cognitive State Classification from Speech with Multi-View Pseudo-Labeling

Attention-Enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition

Semi-Supervised Dual Relation Learning for Multi-Label Classification

Leveraging Unlabeled Data for Emotion Recognition With Enhanced Collaborative Semi-Supervised Learning

Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations

Semi-Supervised Multimodal Emotion Recognition with Class-Balanced Pseudo-labeling

Class-Aware Contrastive Semi-Supervised Learning

Combining cross-modal knowledge transfer and semi-supervised learning for speech emotion recognition

Self-supervised Learning for Label-Efficient Sleep Stage Classification: A Comprehensive Evaluation

Boosting Semi-Supervised Learning with Contrastive Complementary Labeling

Neural collapse inspired semi-supervised learning with fixed classifier

Class Balanced Adaptive Pseudo Labeling for Federated Semi-Supervised Learning

Exploring Multi-Task Learning and Data Augmentation in Dementia Detection with Self-Supervised Pretrained Models

Multi-instance Learning for Bipolar Disorder Diagnosis Using Weakly Labelled Speech Data

Employing unlabeled data to improve the classification performance of SVM, and its application in audio event classification

Multi-objective Progressive Clustering for Semi-supervised Domain Adaptation in Speaker Verification

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation

Robust Semi-Supervised Learning when Not All Classes have Labels

DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples

Suicide Risk Assessment on Social Media with Semi-Supervised Learning

Robust Deep Semi-Supervised Learning with Label Propagation and Differential Privacy