Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation

Jiachen Liang, Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
2024-05-31
Abstract: Traditional semi-supervised learning (SSL) assumes that the feature distributions of labeled and unlabeled data are consistent, which rarely holds in realistic scenarios. In this paper, we propose a novel SSL setting, where unlabeled samples are drawn from a mixed distribution that deviates from the feature distribution of labeled samples. Under this setting, previous SSL methods tend to predict wrong pseudo-labels with the model fitted on labeled data, resulting in noise accumulation. To tackle this issue, we propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions. SSFA decouples the prediction of pseudo-labels from the current model to improve the quality of pseudo-labels. In particular, SSFA incorporates a self-supervised task into the SSL framework and uses it to adapt the feature extractor of the model to the unlabeled data. In this way, the extracted features better fit the distribution of unlabeled data, thereby generating high-quality pseudo-labels. Extensive experiments show that our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily focuses on the **Feature Distribution Mismatch in Semi-Supervised Learning (FDM-SSL)** problem. Traditional semi-supervised learning (SSL) methods assume that the feature distributions of labeled and unlabeled data are consistent. In real-world scenarios, this assumption often does not hold. When the unlabeled data comes from a different distribution than the labeled data, the model fitted on labeled data generates incorrect pseudo-labels, and the resulting noise accumulates during training and degrades model performance.

To address this issue, the authors propose the **Self-Supervised Feature Adaptation (SSFA) framework**. SSFA decouples pseudo-label prediction from the current model by introducing a self-supervised task that updates the feature extractor, so that it adapts better to the distribution of the unlabeled data. Specifically:

1. **Self-Supervised Task**: Incorporate a self-supervised task within the SSL framework to adapt the feature extractor to the distribution of the unlabeled data.
2. **Feature Adaptation Module**: Update the feature extractor through self-supervised learning to better match the distribution of the unlabeled data, thereby generating high-quality pseudo-labels.
3. **Experimental Validation**: Extensive experiments validate the performance of SSFA on labeled, unlabeled, and even unseen distributions.

In summary, the paper aims to improve the performance of semi-supervised learning under feature distribution mismatch through the SSFA framework.
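
The adaptation idea summarized above can be illustrated with a deliberately tiny sketch. It is not the authors' implementation: the 1-D "feature extractor" (a single learnable shift), the sigmoid classifier head, the confidence threshold, and all function names are hypothetical. The auxiliary loss used here, the squared mean of the features, is only a simple stand-in for a real self-supervised task. The sketch shows just the control flow: adapt a copy of the extractor on the unlabeled batch using an auxiliary loss that needs no labels, then predict pseudo-labels from the adapted features.

```python
import math

def features(xs, shift):
    """Toy feature extractor: subtract a learnable shift from each 1-D input."""
    return [x - shift for x in xs]

def classify(feats, scale=2.0):
    """Toy classifier head (assumed fitted on labeled data): P(class 1)."""
    return [1.0 / (1.0 + math.exp(-scale * f)) for f in feats]

def aux_loss_grad(xs, shift):
    """Gradient w.r.t. shift of the stand-in auxiliary loss (mean feature)^2.

    This loss uses no labels; it is NOT the self-supervised task of the paper.
    """
    mean_feat = sum(features(xs, shift)) / len(xs)
    return -2.0 * mean_feat

def adapt_shift(xs, shift, lr=0.25, steps=20):
    """Adapt a copy of the extractor parameter on the unlabeled batch only."""
    adapted = shift
    for _ in range(steps):
        adapted -= lr * aux_loss_grad(xs, adapted)
    return adapted

def pseudo_labels(xs, shift, threshold=0.8):
    """Return (label, confidence) pairs; label is None below the threshold."""
    out = []
    for p in classify(features(xs, shift)):
        conf = max(p, 1.0 - p)
        label = int(p >= 0.5) if conf >= threshold else None
        out.append((label, conf))
    return out

# Unlabeled batch whose inputs are shifted by +3 relative to the labeled data,
# where the two classes were centered at -1 and +1.
unlabeled = [2.0, 2.2, 1.8, 4.0, 4.1, 3.9]
true_labels = [0, 0, 0, 1, 1, 1]

base_shift = 0.0  # extractor parameter fitted on labeled data
adapted_shift = adapt_shift(unlabeled, base_shift)

labels_before = [label for label, _ in pseudo_labels(unlabeled, base_shift)]
labels_after = [label for label, _ in pseudo_labels(unlabeled, adapted_shift)]
```

In this toy setting, the unadapted extractor assigns every unlabeled sample to class 1 with high confidence, which is exactly the kind of confident but wrong pseudo-label that accumulates as noise. After adaptation, the features are re-centered and the same classifier head recovers both classes.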