Abstract: We address the challenging problem of Long-Tailed Semi-Supervised Learning (LTSSL), where labeled data exhibit an imbalanced class distribution and unlabeled data follow an unknown distribution. Unlike in balanced SSL, the generated pseudo-labels are skewed towards head classes, intensifying the training bias. This phenomenon is amplified further when the class distributions of the labeled and unlabeled datasets are mismatched, because more unlabeled data are then mislabeled as head classes. To solve this problem, we propose a novel method named ComPlementary Experts (CPE). Specifically, we train multiple experts to model various class distributions, each of them yielding high-quality pseudo-labels within one form of class distribution. Besides, we introduce Classwise Batch Normalization for CPE to avoid performance degradation caused by feature distribution mismatch between head and non-head classes. CPE achieves state-of-the-art performance on the CIFAR-10-LT, CIFAR-100-LT, and STL-10-LT benchmarks. For instance, on CIFAR-10-LT, CPE improves test accuracy by over 2.22% compared to baselines. Code is available at
What problem does this paper attempt to address?
The paper asks how to improve model performance in Long-Tailed Semi-Supervised Learning (LTSSL) when the labeled data have an imbalanced class distribution and the unlabeled data follow an unknown distribution. In LTSSL, the generated pseudo-labels tend to be biased towards the head classes, which exacerbates the training bias. The bias is more severe when the class distributions of the labeled and unlabeled data do not match, and it lowers test accuracy on the tail classes. To solve this problem, the paper proposes a method called ComPlementary Experts (CPE). CPE trains multiple experts to model different class distributions, so that each expert generates high-quality pseudo-labels under one specific type of class distribution, which improves overall performance.
### Background of the Paper and Problem Description
In long-tailed semi-supervised learning, the labeled dataset usually has a long-tailed class distribution: a few head classes contain a large number of samples, while most tail classes contain few. The class distribution of the unlabeled dataset may match that of the labeled dataset (the consistent case), be uniform (the uniform case), or be the reverse of the labeled distribution (the reverse case). Existing LTSSL methods usually assume that the labeled and unlabeled class distributions are similar. In practice this assumption is often violated, which degrades model performance, especially on tail classes.
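The three unlabeled-data cases can be made concrete with a small sketch. It assumes the exponential-decay class-size profile commonly used to build long-tailed benchmarks; the helper name `long_tailed_counts` and the sizes and imbalance ratio below are illustrative choices, not values taken from the paper.

```python
def long_tailed_counts(num_classes, max_count, imbalance_ratio):
    """Class sizes decaying exponentially from max_count (head class)
    down to max_count / imbalance_ratio (last tail class).

    Assumed profile: n_c = max_count * imbalance_ratio ** (-c / (C - 1)).
    Requires num_classes > 1.
    """
    return [
        round(max_count * imbalance_ratio ** (-c / (num_classes - 1)))
        for c in range(num_classes)
    ]


# Hypothetical sizes for a 10-class problem.
labeled_counts = long_tailed_counts(10, 1500, 100)

unlabeled_counts = {
    # Same shape as the labeled distribution.
    "consistent": long_tailed_counts(10, 3000, 100),
    # Every class equally represented.
    "uniform": [3000] * 10,
    # Tail classes of the labeled set are the most frequent here.
    "reverse": list(reversed(long_tailed_counts(10, 3000, 100))),
}
```

In the reverse case, a model that inherits the labeled set's head-class bias would mislabel many unlabeled tail-class samples, which is the mismatch the paper targets.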
### Overview of the Solution
To address the above challenges, the paper proposes the Complementary Experts (CPE) method. The main contributions of CPE include:
1. **Multi-expert System**:
- **Expert Training**: CPE trains three experts, each with a different logit adjustment intensity, to cover different types of class distributions. The first expert is trained with the regular cross-entropy loss and suits the consistent case; the second uses balanced logit adjustment and suits the uniform case; the third uses reverse logit adjustment and suits the reverse case.
- **Complementary Mechanism**: Because the experts cover complementary class distributions, CPE can generate high-quality pseudo-labels whichever type of distribution the unlabeled data follow, which improves the overall performance of the model.
2. **Classwise Batch Normalization (CBN)**:
- **Feature Processing**: The CBN mechanism applies batch normalization independently to the features of different classes, to avoid performance degradation caused by the feature distribution mismatch between head classes and non-head classes. Specifically, CBN uses three separate batch normalization layers to process the features of head classes, medium classes, and tail classes respectively.
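
The expert training described in the list above can be sketched as a cross-entropy loss whose logits are shifted by a scaled log class prior. This is a minimal illustration of logit adjustment in general, not the authors' implementation: the function names and the intensity values in `EXPERT_TAUS` (0, 1, 2) are assumptions chosen so that 0 recovers plain cross-entropy, 1 gives the balanced adjustment, and a larger value pushes further towards the tail.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def logit_adjusted_ce(logits, label, class_counts, tau):
    """Cross-entropy on logits shifted by tau * log(class prior).

    tau = 0 is the regular cross-entropy loss. Larger tau penalizes
    tail-class examples more during training, so the unadjusted logits
    used at inference lean further towards tail classes.
    """
    total = sum(class_counts)
    log_prior = [math.log(n / total) for n in class_counts]
    adjusted = [z + tau * lp for z, lp in zip(logits, log_prior)]
    return -log_softmax(adjusted)[label]


# Assumed intensities, one per expert (hypothetical values).
EXPERT_TAUS = {"consistent": 0.0, "uniform": 1.0, "reverse": 2.0}

class_counts = [500, 50, 5]   # head, medium, tail (illustrative)
logits = [1.0, 1.0, 1.0]      # an undecided prediction
tail_label = 2

tail_losses = {
    name: logit_adjusted_ce(logits, tail_label, class_counts, tau)
    for name, tau in EXPERT_TAUS.items()
}
```

For the same undecided prediction on a tail-class sample, the loss grows with the adjustment intensity, so the experts with stronger adjustment must learn a larger margin for tail classes and therefore behave differently on unlabeled data.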
### Experimental Results
The paper reports experiments on three benchmark datasets, CIFAR-10-LT, CIFAR-100-LT, and STL-10-LT, to verify the effectiveness of CPE. The results show that CPE achieves the best or near-best performance in the consistent, uniform, and reverse cases, with especially strong results on tail classes. For example, on CIFAR-10-LT in the consistent case, the test accuracy of CPE is 0.44 percentage points higher than that of the previous state-of-the-art method (ACR).
### Conclusion
By combining the multi-expert system with Classwise Batch Normalization, CPE addresses the class distribution mismatch problem in long-tailed semi-supervised learning, improves model performance on tail classes, and reaches a new state-of-the-art level on the reported benchmarks.