Class-Imbalanced Complementary-Label Learning via Weighted Loss

Meng Wei, Yong Zhou, Zhongnian Li, Xinzheng Xu
DOI: https://doi.org/10.1016/j.neunet.2023.07.030
2023-06-17
Abstract: Complementary-label learning (CLL) is widely used in weakly supervised classification, but it faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples. In such scenarios, the number of samples in one class is considerably lower than in other classes, which leads to a decline in prediction accuracy. Unfortunately, existing CLL approaches have not investigated this problem. To alleviate this challenge, we propose a novel problem setting that enables learning from class-imbalanced complementary labels for multi-class classification. To tackle this problem, we propose a novel CLL approach called Weighted Complementary-Label Learning (WCLL). The proposed method models a weighted empirical risk minimization loss by utilizing the class-imbalanced complementary labels, which is also applicable to multi-class imbalanced training samples. Furthermore, we derive an estimation error bound to provide theoretical assurance. To evaluate our approach, we conduct extensive experiments on several widely used benchmark datasets and a real-world dataset, and compare our method with existing state-of-the-art methods. The proposed approach shows significant improvement on these datasets, even in the case of multiple class-imbalanced scenarios. Notably, the proposed method not only utilizes complementary labels to train a classifier but also solves the problem of class imbalance.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the challenges that complementary-label learning (CLL) faces on class-imbalanced datasets. Specifically, when some classes have far fewer training samples than others, existing CLL methods cannot handle the imbalance effectively, and prediction accuracy declines. To solve this problem, the authors propose a new method, Weighted Complementary-Label Learning (WCLL), which minimizes a weighted loss function to alleviate the impact of class imbalance.

### Problem Background

In real-world datasets, class imbalance is a common and difficult problem. For example, in medical image classification, the number of normal samples is usually much larger than that of diseased samples, especially in disease detection involving privacy protection. In this case, obtaining accurate ground-truth labels can be very difficult and time-consuming. Using complementary labels, which specify a class that an instance does not belong to, therefore becomes a more feasible alternative. However, class imbalance makes learning from complementary labels more difficult, because the model may be biased towards the majority class and ignore the minority class.

### Core Problem of the Paper

The core problem of this paper is how to perform complementary-label learning effectively on class-imbalanced datasets so as to improve the prediction accuracy of the classifier, especially on the minority class.

### Solution

To solve this problem, the authors propose the WCLL method. The main contributions include:

1. **Introducing a weighted loss function**: By introducing a weight term for the samples of each class, the loss function can better handle the class-imbalance problem. Specifically, the loss function is defined as:

   \[ \hat{\ell}(f(x), \bar{y}) = \omega \bar{\ell}(f(x), \bar{y}) \]

   where \(\bar{y}\) is the complementary label, \(\bar{\ell}\) is the complementary loss, and \(\omega\) is a K-dimensional loss weight vector that is inversely proportional to the class-imbalance proportions.

2. **Theoretical guarantee**: The authors prove that the proposed method converges to the optimal solution and give a theoretical analysis of the estimation error bound. Specifically, as the number of training samples \(N_l\) increases, the classification risk \(R(\hat{f})\) converges to the optimal classification risk \(R(f^*)\).

3. **Experimental verification**: Experiments on multiple benchmark datasets (such as MNIST and CIFAR-10) verify the effectiveness of the WCLL method. The experimental results show that WCLL is significantly superior to existing CLL methods in the case of class imbalance.

### Summary

This paper addresses complementary-label learning on class-imbalanced datasets by introducing a weighted loss function, and improves the prediction accuracy of the classifier on the minority class. The method is applicable to multi-class classification tasks and provides a valuable reference for future research.
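
To make the weighted loss \(\hat{\ell} = \omega \bar{\ell}\) more concrete, the following is a minimal NumPy sketch, not the authors' implementation. Several details are assumptions that this summary does not specify: the complementary loss \(\bar{\ell}\) is taken to be \(-\log(1 - p_{\bar{y}})\), the weight applied to a sample is the entry of \(\omega\) indexed by its complementary label, and \(\omega\) is computed as the inverse complementary-label frequency, normalized to average 1. The function names `class_weights` and `weighted_complementary_loss` are hypothetical.

```python
import numpy as np

def class_weights(comp_labels, num_classes):
    """Assumed weighting: inverse complementary-label frequency, mean-normalized to 1."""
    counts = np.bincount(comp_labels, minlength=num_classes).astype(float)
    counts = np.maximum(counts, 1.0)  # guard against division by zero for unseen labels
    inv = 1.0 / counts
    return inv * num_classes / inv.sum()

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def weighted_complementary_loss(logits, comp_labels, weights, eps=1e-12):
    """Mean of omega[y_bar] * l_bar, with the assumed l_bar = -log(1 - p[y_bar])."""
    probs = softmax(logits)
    # Probability the model assigns to the class the sample is known NOT to belong to.
    p_bar = probs[np.arange(len(comp_labels)), comp_labels]
    per_sample = -np.log(1.0 - p_bar + eps)
    return float(np.mean(weights[comp_labels] * per_sample))

# Toy example: 6 samples, 3 classes, imbalanced complementary labels.
comp_labels = np.array([0, 0, 0, 1, 2, 2])
w = class_weights(comp_labels, num_classes=3)
logits = np.zeros((6, 3))  # an untrained model with uniform predictions
loss = weighted_complementary_loss(logits, comp_labels, w)
```

In this toy example the rarest complementary label (class 1) receives the largest weight, so its samples contribute more to the averaged loss than they would under an unweighted objective. Other weighting schemes or complementary losses could be swapped in without changing the overall structure.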