CroSel: Cross Selection of Confident Pseudo Labels for Partial-Label Learning

Shiyu Tian, Hongxin Wei, Yiqun Wang, Lei Feng
2024-03-27
Abstract: Partial-label learning (PLL) is an important weakly supervised learning problem, which allows each training example to have a candidate label set instead of a single ground-truth label. Identification-based methods have been widely explored to tackle label ambiguity issues in PLL, which regard the true label as a latent variable to be identified. However, identifying the true labels accurately and completely remains challenging, causing noise in pseudo labels during model training. In this paper, we propose a new method called CroSel, which leverages historical predictions from the model to identify true labels for most training examples. First, we introduce a cross selection strategy, which enables two deep models to select true labels of partially labeled data for each other. Besides, we propose a novel consistency regularization term called co-mix to avoid sample waste and tiny noise caused by false selection. In this way, CroSel can pick out the true labels of most examples with high precision. Extensive experiments demonstrate the superiority of CroSel, which consistently outperforms previous state-of-the-art methods on benchmark datasets. Additionally, our method achieves over 90% accuracy and quantity for selecting true labels on CIFAR-type datasets under various settings.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses Partial-Label Learning (PLL). It proposes a new method that aims to identify true labels more accurately during training and to reduce the impact of pseudo-label noise on the model.

### Research Background and Problem

- **Research Background**: Deep learning has shown outstanding performance in many application domains, but its success relies on large-scale, fully labeled datasets. Obtaining such datasets with fully accurate labels is highly challenging in practice. Researchers have therefore explored partial-label learning, a weakly supervised setting in which each training sample has a candidate label set that includes the true label.
- **Problem Faced**: In partial-label learning, each training sample has a set of candidate labels, of which only one is the true label and the others are incorrect. This label ambiguity can harm model training, especially when the method tries to identify the true label for each sample.

### Solution Overview

The method proposed in the paper is called CroSel (Cross Selection of Confident Pseudo Labels). It mainly includes the following aspects:

1. **Cross Selection Strategy**: Two deep models select true labels for each other. Specifically, if a model consistently predicts the same label with high confidence for a given input image over a period of time, that label is considered likely to be the true label. In this way, the true labels can be identified accurately for most training samples.
2. **Consistency Regularization Term**: To further improve selection accuracy and avoid wasting samples because of incorrect selection, a consistency regularization term called co-mix is introduced. Based on MixUp, it generates trainable targets for all samples and serves as an important supplement to the selection step.
3. **Experimental Results**: Extensive experiments demonstrate that CroSel performs better than existing methods on benchmark datasets.

### Main Contributions

- Proposed a cross-selection strategy that selects high-confidence pseudo labels from the candidate label set, achieving both high selection accuracy and a high selection rate.
- Introduced a new consistency regularization term that uses MixUp to augment the data and generate trainable targets for all samples.
- Showed experimentally that CroSel outperforms existing state-of-the-art methods on commonly used benchmark datasets, with detailed ablation studies analyzing the effect of each component of CroSel.

In summary, the paper proposes CroSel, a partial-label learning method aimed at the label ambiguity problem. Through cross selection and consistency regularization, it improves both the accuracy and the quantity of true labels selected during model training.
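
The cross selection idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `select_confident`, the `(T, N, C)` history layout, the default `threshold` of 0.9, and the exact combination of the three checks (stable prediction, prediction inside the candidate set, mean confidence above a threshold) are all hypothetical choices made to match the prose description.

```python
import numpy as np

def select_confident(history, candidates, threshold=0.9):
    """Pick confident pseudo labels from one model's recent predictions.

    history:    (T, N, C) softmax outputs over the last T epochs (assumed layout).
    candidates: (N, C) boolean mask, True where a label is in the candidate set.
    threshold:  assumed confidence cutoff; the value used in the paper may differ.
    Returns (selected, labels): a boolean mask of shape (N,) and the latest
    predicted label of every sample, shape (N,).
    """
    history = np.asarray(history, dtype=float)
    candidates = np.asarray(candidates, dtype=bool)

    preds = history.argmax(axis=2)              # (T, N) predicted label per epoch
    labels = preds[-1]                          # (N,) most recent prediction
    idx = np.arange(labels.shape[0])

    stable = (preds == labels).all(axis=0)      # same label in every stored epoch
    in_candidates = candidates[idx, labels]     # label belongs to the candidate set
    mean_conf = history[:, idx, labels].mean(axis=0)  # average confidence in it

    selected = stable & in_candidates & (mean_conf > threshold)
    return selected, labels

# Toy data: 2 epochs, 3 samples, 3 classes (values are made up for illustration).
history_a = np.array([
    [[0.95, 0.03, 0.02], [0.60, 0.30, 0.10], [0.05, 0.93, 0.02]],
    [[0.97, 0.02, 0.01], [0.20, 0.70, 0.10], [0.04, 0.94, 0.02]],
])
candidates = np.array([
    [True, True, False],
    [True, True, False],
    [True, False, True],
])

# "Cross" selection: labels chosen from model A's history would supervise model B.
selected_for_b, labels_for_b = select_confident(history_a, candidates)
```

In this toy example only the first sample passes all three checks: the second changes its prediction between epochs, and the third is confidently assigned a label outside its candidate set. In a two-model setup, the same function would be called on model B's history to choose training labels for model A.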