RegMixMatch: Optimizing Mixup Utilization in Semi-Supervised Learning

Haorong Han, Jidong Yuan, Chixuan Wei, Zhongyang Yu
2024-12-14
Abstract: Consistency regularization and pseudo-labeling have significantly advanced semi-supervised learning (SSL). Prior works have effectively employed Mixup for consistency regularization in SSL. However, our findings indicate that applying Mixup for consistency regularization may degrade SSL performance by compromising the purity of artificial labels. Moreover, most pseudo-labeling-based methods use a thresholding strategy to exclude low-confidence data, aiming to mitigate confirmation bias; however, this approach limits the utility of unlabeled samples. To address these challenges, we propose RegMixMatch, a novel framework that optimizes the use of Mixup with both high- and low-confidence samples in SSL. First, we introduce semi-supervised RegMixup, which addresses the reduced purity of artificial labels by using both mixed samples and clean samples for training. Second, we develop a class-aware Mixup technique that integrates information from the top-2 predicted classes into low-confidence samples and their artificial labels, reducing the confirmation bias associated with these samples and enhancing their effective utilization. Experimental results demonstrate that RegMixMatch achieves state-of-the-art performance across various SSL benchmarks.
Machine Learning, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper targets two problems in existing semi-supervised learning (SSL) methods: Mixup lowers the purity of artificial labels, and low-confidence samples are under-utilized. Specifically:

1. **The impact of Mixup on the purity of artificial labels**: Although Mixup performs well in consistency regularization, applying it can reduce the purity of artificial labels and thereby hurt model performance. Because Mixup generates mixed samples and labels by interpolation, it induces high-entropy behavior: predicted probabilities become less certain, and the quality of the artificial labels drops.
2. **The utilization of low-confidence samples**: Most pseudo-label-based methods use a threshold to exclude low-confidence data in order to reduce confirmation bias. This limits how much of the unlabeled data is actually used, because many potentially useful low-confidence samples are discarded.

To solve these problems, the paper proposes the RegMixMatch framework, which has two main components:

- **Semi-supervised RegMixup (SRM)**: By training on both unmixed (clean) samples and mixed samples, SRM counters the loss of artificial-label purity caused by Mixup. It relies on pseudo-labeling and weak-to-strong consistency regularization to make effective use of high-confidence samples.
- **Class-aware Mixup (CAM)**: For low-confidence samples, CAM reduces noise and improves the quality of artificial labels by mixing these samples with high-confidence samples selected according to their predicted classes (the abstract describes this as integrating information from the top-2 predicted classes). This puts low-confidence samples to use while reducing confirmation bias.
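The split into high- and low-confidence samples, and the class-aware choice of Mixup partners, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`split_by_confidence`, `top_k_classes`, `cam_candidates`), the threshold value, and the toy predictions are all hypothetical, and the partner rule (a high-confidence sample whose pseudo-label is among the low-confidence sample's top-2 predicted classes) is an assumption based on the abstract's wording. Plain Python lists are used to keep the example dependency-free.

```python
def split_by_confidence(probs, tau):
    """Split sample indices by whether the max predicted probability reaches tau."""
    high, low = [], []
    for idx, p in enumerate(probs):
        (high if max(p) >= tau else low).append(idx)
    return high, low


def top_k_classes(p, k=2):
    """Indices of the k most probable classes, most probable first."""
    return sorted(range(len(p)), key=lambda c: p[c], reverse=True)[:k]


def cam_candidates(probs, high, low, k=2):
    """For each low-confidence sample, list the high-confidence samples whose
    pseudo-label falls in its top-k predicted classes (assumed partner rule)."""
    pseudo = {j: top_k_classes(probs[j], 1)[0] for j in high}
    return {
        i: [j for j in high if pseudo[j] in top_k_classes(probs[i], k)]
        for i in low
    }


# Toy predictions over three classes (made-up numbers).
probs = [
    [0.97, 0.02, 0.01],  # confident, class 0
    [0.45, 0.40, 0.15],  # low-confidence, top-2 classes are 0 and 1
    [0.03, 0.96, 0.01],  # confident, class 1
    [0.01, 0.01, 0.98],  # confident, class 2
    [0.10, 0.35, 0.55],  # low-confidence, top-2 classes are 2 and 1
]

high, low = split_by_confidence(probs, tau=0.95)
candidates = cam_candidates(probs, high, low)
print("high-confidence:", high)
print("low-confidence:", low)
print("CAM partner candidates:", candidates)
```

A plain thresholding method would simply drop the samples in `low`; in this sketch they are instead paired with partners from `candidates`, which is the idea behind using low-confidence data rather than discarding it.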
### Formula representation

To understand these methods more clearly, the following are the relevant formulas:

- **Empirical Risk Minimization (ERM)**:

  \[ P_{\delta}(x, y) = \frac{1}{n} \sum_{i=1}^{n} \delta(x = x_i, y = y_i) \]

  where \(\delta(x = x_i, y = y_i)\) represents the Dirac mass centered at \((x_i, y_i)\).

- **Neighborhood distribution of Mixup**:

  \[ P_v(x, y) = \frac{1}{n} \sum_{i=1}^{n} \delta(x = \bar{x}_i, y = \bar{y}_i) \]

  where \(\bar{x}_i = \lambda x_i + (1 - \lambda) x_j\) and \(\bar{y}_i = \lambda y_i + (1 - \lambda) y_j\).

- **SRM loss function**:

  \[ L_u = \frac{1}{\mu B} \sum_{b=1}^{\mu B} \mathbb{1}(\max(q_b) \geq \tau_c) \, H(\hat{q}_b, p_m(y \mid A(x_u^b))) \]

- **Mixed-sample loss function**:

  \[ L_m = \frac{1}{|H|} \sum_{i=1}^{|H|} H(\hat{q}_i \oplus \hat{q}_j, p_m(y \mid A(x_u^i) \oplus A(x_u^j))) \]

- **Class-aware Mixup loss function**:

  \[ L_{cm} = \frac{1}{|H_c|} \sum_{i=1}^{|H_c|} \| q_i \oplus \hat{q}_j - p_m(y \mid A(x_u^i) \oplus A(x_u^j)) \|_2^2 \]

Through these improvements, RegMixMatch achieves state-of-the-art performance in multiple semi-supervised learning benchmark tests.
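The three loss formulas above can be traced on toy numbers with the following sketch. It rests on several assumptions that the summary does not state: \(\oplus\) is read as the Mixup interpolation \(\lambda a + (1 - \lambda) b\), \(\hat{q}\) as the one-hot (hard) version of the prediction \(q\), \(H(\cdot, \cdot)\) as cross-entropy, and \(\tau_c\) as a confidence threshold. The model output \(p_m(y \mid \cdot)\) is passed in as a ready-made probability list rather than computed by a network, \(\lambda\) is fixed instead of sampled, and all function names are hypothetical.

```python
import math


def mixup(a, b, lam):
    """Element-wise convex combination lam * a + (1 - lam) * b."""
    return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]


def hard_label(q):
    """One-hot vector at the argmax of q (assumed meaning of q_hat)."""
    top = max(range(len(q)), key=lambda c: q[c])
    return [1.0 if c == top else 0.0 for c in range(len(q))]


def cross_entropy(target, pred, eps=1e-12):
    """H(target, pred) = -sum_c target_c * log(pred_c)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, pred))


def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def loss_u(weak_probs, strong_probs, tau):
    """L_u: confidence-masked cross-entropy, averaged over every sample in the batch."""
    total = 0.0
    for q, p in zip(weak_probs, strong_probs):
        if max(q) >= tau:  # indicator 1(max(q_b) >= tau_c)
            total += cross_entropy(hard_label(q), p)
    return total / len(weak_probs)


def loss_m(pairs, lam):
    """L_m: cross-entropy against mixed hard labels.
    Each pair is (q_i, q_j, prediction on the mixed input)."""
    terms = [
        cross_entropy(mixup(hard_label(qi), hard_label(qj), lam), p)
        for qi, qj, p in pairs
    ]
    return sum(terms) / len(terms)


def loss_cm(pairs, lam):
    """L_cm: squared L2 distance to a soft label mixed with a hard label.
    Each pair is (low-confidence q_i, high-confidence q_j, prediction on the mixed input)."""
    terms = [squared_l2(mixup(qi, hard_label(qj), lam), p) for qi, qj, p in pairs]
    return sum(terms) / len(terms)


# Toy two-class numbers (made up for illustration).
l_u = loss_u(
    weak_probs=[[0.9, 0.1], [0.6, 0.4]],
    strong_probs=[[0.8, 0.2], [0.5, 0.5]],
    tau=0.85,
)
l_m = loss_m([([0.9, 0.1], [0.05, 0.95], [0.6, 0.4])], lam=0.7)
l_cm = loss_cm([([0.6, 0.4], [0.05, 0.95], [0.5, 0.5])], lam=0.5)
print(l_u, l_m, l_cm)
```

In `loss_u` only the first sample passes the threshold, yet the sum is still divided by the full batch size, mirroring the \(1 / (\mu B)\) factor. `loss_cm` keeps the low-confidence prediction soft (`qi` is not converted to a hard label), which matches the \(q_i \oplus \hat{q}_j\) term in the formula.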