Abstract: Semi-supervised learning (SSL) trains models from a small set of labeled data together with a much larger pool of unlabeled data. SSL research has focused predominantly on classification, using methods such as consistency regularization and pseudo-labeling. Applying these methods directly to regression is difficult, primarily because the reliability of a pseudo-label is hard to evaluate in a regression setting. This paper introduces SimRegMatch, a novel semi-supervised regression (SSR) framework that addresses this challenge by combining an uncertainty-based filtering mechanism with a similarity-based pseudo-label calibration approach. The filtering component estimates uncertainty to identify the unlabeled examples whose pseudo-labels are sufficiently reliable. The calibration component then refines these pseudo-labels by propagating information from labeled to unlabeled examples, improving pseudo-label quality. SimRegMatch was evaluated on the publicly available AgeDB dataset for age prediction and on a practical regression problem: detecting interior noise levels in automobiles from accelerometer data. Compared with current state-of-the-art semi-supervised regression methods, SimRegMatch improved regression performance. Ablation studies were carried out to identify which elements of the framework contributed to these improvements. By combining uncertainty estimation with pseudo-label calibration, SimRegMatch addresses a central issue in semi-supervised regression, the assessment of pseudo-label reliability, and shows potential for broad applicability across SSR scenarios. A PyTorch implementation is publicly available at https://github.com/YongwonJo/SimRegMatch .
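The two components described in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation (see the repository above for that). The abstract does not specify how uncertainty is estimated or how information is propagated, so the sketch assumes that uncertainty is the variance of several stochastic predictions per unlabeled example, and that calibration is a similarity-weighted average of labeled targets blended with the original pseudo-label. The function names and the `threshold`, `temperature`, and `beta` parameters are hypothetical.

```python
import math
from statistics import mean, pvariance


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def filter_by_uncertainty(stochastic_preds, threshold):
    """Keep unlabeled examples whose prediction variance is at most `threshold`.

    stochastic_preds[i] holds several predictions for unlabeled example i
    (assumption: e.g. repeated forward passes with dropout enabled).
    Returns {index: pseudo_label}, where the pseudo-label is the mean prediction.
    """
    kept = {}
    for i, preds in enumerate(stochastic_preds):
        if pvariance(preds) <= threshold:
            kept[i] = mean(preds)
    return kept


def calibrate(pseudo_label, feat, labeled_feats, labeled_targets,
              temperature=0.1, beta=0.5):
    """Blend a pseudo-label with a similarity-weighted average of labeled targets.

    Weights are a softmax over cosine similarities (assumed form of propagation).
    beta=1.0 keeps the original pseudo-label unchanged.
    """
    scores = [cosine(feat, f) / temperature for f in labeled_feats]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    propagated = sum(w * y for w, y in zip(weights, labeled_targets)) / sum(weights)
    return beta * pseudo_label + (1.0 - beta) * propagated


# Toy data: two labeled examples (ages 20 and 60) and two unlabeled examples.
labeled_feats = [[1.0, 0.0], [0.0, 1.0]]
labeled_targets = [20.0, 60.0]
unlabeled_feats = [[1.0, 0.0], [1.0, 1.0]]
stochastic_preds = [[30.0, 30.0, 30.0], [10.0, 50.0, 90.0]]

# Only the first unlabeled example has low enough variance to be kept.
kept = filter_by_uncertainty(stochastic_preds, threshold=4.0)
refined = {
    i: calibrate(y, unlabeled_feats[i], labeled_feats, labeled_targets)
    for i, y in kept.items()
}
```

In this toy run the second unlabeled example is discarded because its predictions disagree strongly, while the first one is kept and its pseudo-label of 30 is pulled toward the target of the most similar labeled example (20). In a training loop, the refined values would serve as regression targets for the retained unlabeled examples.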

Deep semi-supervised regression via pseudo-label filtering and calibration

SemiReward: A General Reward Model for Semi-supervised Learning

Towards Self-Adaptive Pseudo-Label Filtering for Semi-Supervised Learning

Semi-supervised regression via embedding space mapping and pseudo-label smearing

Meta-Semi: A Meta-learning Approach for Semi-supervised Learning

Meta pseudo label tabular-related regression model for surrogate modeling

Mixed Semi-Supervised Generalized-Linear-Regression with applications to Deep-Learning and Interpolators

LaSSL: Label-Guided Self-Training for Semi-supervised Learning

Boosting Semi-Supervised Learning by bridging high and low-confidence predictions

Leveraging Local Variance for Pseudo-Label Selection in Semi-supervised Learning

Learning Safe Prediction for Semi-Supervised Regression

Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective

Semi-Supervised Empirical Risk Minimization: Using unlabeled data to improve prediction

Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data

Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning

SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning

An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning

Reinforcement Learning-Guided Semi-Supervised Learning

Feature Space Renormalization for Semi-supervised Learning

In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning

Revisiting Deep Semi-supervised Learning: An Empirical Distribution Alignment Framework and Its Generalization Bound