Abstract: Semi-supervised learning (SSL) seeks to enhance task performance by training on both labeled and unlabeled data. Mainstream SSL image classification methods mostly optimize a loss that additively combines a supervised classification objective with a regularization term derived solely from unlabeled data. This formulation neglects the potential for interaction between labeled and unlabeled images. In this paper, we introduce InterLUDE, a new approach to enhance SSL made of two parts that each benefit from labeled-unlabeled interaction. The first part, embedding fusion, interpolates between labeled and unlabeled embeddings to improve representation learning. The second part is a new loss, grounded in the principle of consistency regularization, that aims to minimize discrepancies in the model's predictions between labeled versus unlabeled inputs. Experiments on standard closed-set SSL benchmarks and a medical SSL task with an uncurated unlabeled set show clear benefits to our approach. On the STL-10 dataset with only 40 labels, InterLUDE achieves 3.2% error rate, while the best previous method reports 14.9%.
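
As a rough illustration, the additive objective that the abstract attributes to mainstream SSL methods can be written as below. The symbols are assumptions introduced here, not notation from the paper: B_L and B_U are the labeled and unlabeled mini-batches, f_θ is the classifier, ℓ_sup is a supervised loss such as cross-entropy, ℓ_reg is an unlabeled-data regularizer, and λ is a weighting coefficient.

```latex
\mathcal{L}(\theta)
  = \frac{1}{|B_L|} \sum_{(x, y) \in B_L} \ell_{\mathrm{sup}}\bigl(f_\theta(x),\, y\bigr)
  + \lambda \, \frac{1}{|B_U|} \sum_{u \in B_U} \ell_{\mathrm{reg}}\bigl(f_\theta,\, u\bigr)
```

In this form the second term depends only on unlabeled inputs, so no term couples a labeled example with an unlabeled one. That missing coupling is the gap the abstract points to.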

What problem does this paper attempt to address?

This paper addresses how to make better use of the interaction between labeled and unlabeled data in semi-supervised learning (SSL) to improve model performance. Existing SSL methods typically train the model by optimizing a loss that combines a supervised classification objective with a regularization term derived only from unlabeled data. This formulation ignores potential interaction between labeled and unlabeled data, which limits how much the unlabeled data can contribute.

To solve this problem, the paper introduces a new SSL algorithm, InterLUDE, which consists of two main parts:

1. **Embedding Fusion**: Improves representation learning by interpolating between labeled and unlabeled embedding vectors. By directly mixing the feature representations of labeled and unlabeled data during training, this part is intended to help the model better capture the internal structure of the data.
2. **Cross-Instance Delta Consistency Loss**: Based on the consistency regularization principle, this loss term aims to minimize discrepancies between the model's predictions on labeled and unlabeled inputs. Specifically, it encourages the change in the model's predictions under the same augmentation to be consistent between labeled and unlabeled samples.

Through these two parts, InterLUDE promotes direct interaction between labeled and unlabeled data, thereby improving model performance. Experimental results show clear benefits both on standard closed-set SSL benchmarks and on a medical SSL task with an uncurated unlabeled set. For example, on the STL-10 dataset with only 40 labels, InterLUDE achieves a 3.2% error rate, while the best previous method reports 14.9%.

Taken together, these results suggest that InterLUDE performs well on closed-set tasks and remains effective when the unlabeled set is uncurated.
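
The two components described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the pairing rule (index i is paired with index i modulo the other batch size), the mixing weight `alpha`, and the use of batch-averaged prediction changes with a squared-error penalty are all choices made here for clarity.

```python
import numpy as np


def embedding_fusion(z_labeled, z_unlabeled, alpha=0.1):
    """Interpolate labeled embeddings toward unlabeled ones and vice versa.

    ASSUMPTION: each row is paired with the row at (index mod other batch
    size) in the other batch; the source does not specify a pairing rule.
    """
    n_l, n_u = len(z_labeled), len(z_unlabeled)
    partner_for_labeled = z_unlabeled[np.arange(n_l) % n_u]
    partner_for_unlabeled = z_labeled[np.arange(n_u) % n_l]
    fused_labeled = (1.0 - alpha) * z_labeled + alpha * partner_for_labeled
    fused_unlabeled = (1.0 - alpha) * z_unlabeled + alpha * partner_for_unlabeled
    return fused_labeled, fused_unlabeled


def delta_consistency_loss(p_l_weak, p_l_strong, p_u_weak, p_u_strong):
    """Penalize mismatch between prediction changes on labeled vs unlabeled data.

    Each argument is an array of class probabilities, shape (batch, classes),
    for weakly or strongly augmented inputs.

    ASSUMPTION: the change ("delta") is averaged over each batch and compared
    with a squared-error penalty; this is a simplification chosen here.
    """
    delta_labeled = (p_l_strong - p_l_weak).mean(axis=0)
    delta_unlabeled = (p_u_strong - p_u_weak).mean(axis=0)
    return float(np.sum((delta_labeled - delta_unlabeled) ** 2))


# Example usage with random embeddings (4 labeled, 12 unlabeled, dimension 8).
rng = np.random.default_rng(0)
z_l = rng.normal(size=(4, 8))
z_u = rng.normal(size=(12, 8))
fused_l, fused_u = embedding_fusion(z_l, z_u, alpha=0.1)
```

With `alpha=0` the embeddings pass through unchanged, which makes it easy to switch the fusion step off for an ablation. The delta loss is zero whenever augmentation shifts the average prediction by the same amount on both batches, regardless of what the predictions themselves are.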

InterLUDE: Interactions between Labeled and Unlabeled Data to Enhance Semi-Supervised Learning

Category-Level Regularized Unlabeled-to-Labeled Learning for Semi-supervised Prostate Segmentation with Multi-site Unlabeled Data

Interpolation-Based Contrastive Learning for Few-Label Semi-Supervised Learning

Safe Deep Semi-Supervised Learning for Unseen-Class Unlabeled Data

Exploration and Exploitation of Unlabeled Data for Open-Set Semi-Supervised Learning

Fix-A-Step: Semi-supervised Learning from Uncurated Unlabeled Data

Boosting Semi-Supervised Learning with Dual-Threshold Screening and Similarity Learning

Improving Barely Supervised Learning by Discriminating Unlabeled Samples with Super-Class

LaSSL: Label-Guided Self-Training for Semi-supervised Learning

Learning Where to Learn in Cross-View Self-Supervised Learning

Privileged Semi-Supervised Learning

Deep Growing Learning

DualMatch: Robust Semi-Supervised Learning with Dual-Level Interaction

Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of Semi-Supervised Learning and Active Learning

Semi-supervised Learning with Easy Labeled Data via Impartial Labeled Set Extension

AuxMix: Semi-Supervised Learning with Unconstrained Unlabeled Data

Meta-Semi: A Meta-learning Approach for Semi-supervised Learning

SAFER-STUDENT for Safe Deep Semi-Supervised Learning with Unseen-Class Unlabeled Data

Boosting Semi-Supervised Learning with Contrastive Complementary Labeling

Adaptive Semi-Supervised Mixup with Implicit Label Learning and Sample Ratio Balancing

Semi-supervised Learning Regularized by Adversarial Perturbation and Diversity Maximization