Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning

Darshana Saravanan, Naresh Manwani, Vineet Gandhi
2024-05-28
Abstract: Partial label learning (PLL) is a weakly-supervised learning paradigm where each training instance is paired with a set of candidate labels (partial label), one of which is the true label. Noisy PLL (NPLL) relaxes this constraint by allowing some partial labels to not contain the true label, enhancing the practicality of the problem. Our work centres on NPLL and presents a minimalistic framework that initially assigns pseudo-labels to images by exploiting the noisy partial labels through a weighted nearest neighbour algorithm. These pseudo-label and image pairs are then used to train a deep neural network classifier with label smoothing. The classifier's features and predictions are subsequently employed to refine and enhance the accuracy of pseudo-labels. We perform thorough experiments on seven datasets and compare against nine NPLL and PLL methods. We achieve state-of-the-art results in all studied settings from the prior literature, obtaining substantial gains in fine-grained classification and extreme noise scenarios. Further, we show the promising generalisation capability of our framework in realistic crowd-sourced datasets.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily addresses the problem of **Noisy Partial Label Learning (NPLL)**. Specifically:

1. **Partial Label Learning (PLL)**: In PLL, each training sample is associated with a set of candidate labels, one of which is the true label. This setup is common in crowdsourced datasets, where annotators may select several plausible labels when uncertain.
2. **Noisy Partial Label Learning (NPLL)**: NPLL relaxes the PLL constraint by allowing some partial labels to not include the true label, which makes the setting more practical for real-world applications. Existing methods typically employ label disambiguation strategies and handle noise by detecting and mitigating noisy samples. However, these methods are prone to error propagation and require two-stage training, for which the length of the warm-up period is difficult to determine.
3. **Proposed New Framework (PALS)**: The paper proposes a new framework called PALS (Pseudo-labelling And Label Smoothing), which combines pseudo-label generation with label smoothing. PALS uses a weighted nearest neighbor algorithm to assign a pseudo-label to each image and trains the classifier on these pseudo-label and image pairs. Label smoothing makes training more robust to errors introduced in the pseudo-labelling stage. The classifier's features and predictions are then used to refine the pseudo-labels, so that classification accuracy improves iteratively.

In experiments on seven datasets, PALS is compared against nine existing NPLL and PLL methods and achieves state-of-the-art results in all studied settings from the prior literature, with substantial gains in fine-grained classification and extreme-noise scenarios. Its performance on realistic crowdsourced datasets further suggests that the framework generalises well.
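The pseudo-labelling and label smoothing steps described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `knn_pseudo_labels` and `smooth_labels`, the use of cosine similarity as the neighbour weight, the neighbourhood size `k`, and the smoothing `rate` are all assumptions made for illustration, since the summary only states that a weighted nearest neighbour algorithm and label smoothing are used.

```python
import numpy as np


def knn_pseudo_labels(features, candidate_mask, k=2):
    """Pick one pseudo-label per sample by similarity-weighted voting.

    features:       (n, d) array of feature vectors.
    candidate_mask: (n, c) boolean array, True where a class is in the
                    sample's (possibly noisy) candidate label set.
    Assumption: neighbours vote for every class in their candidate set,
    weighted by cosine similarity to the query sample.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T                      # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)           # a sample is not its own neighbour
    nn_idx = np.argsort(-sim, axis=1)[:, :k]             # (n, k) neighbour ids
    nn_sim = np.take_along_axis(sim, nn_idx, axis=1)     # (n, k) similarities
    weights = np.clip(nn_sim, 0.0, None)     # ignore negative similarities
    neighbour_sets = candidate_mask[nn_idx].astype(float)  # (n, k, c)
    votes = np.einsum("nk,nkc->nc", weights, neighbour_sets)
    return votes.argmax(axis=1)


def smooth_labels(labels, num_classes, rate=0.1):
    """Standard label smoothing: (1 - rate) * one_hot + rate / num_classes."""
    targets = np.full((labels.shape[0], num_classes), rate / num_classes)
    targets[np.arange(labels.shape[0]), labels] += 1.0 - rate
    return targets


# Toy data (hypothetical): two clusters in 2-D, three classes.
features = np.array([
    [1.0, 0.0], [0.9, 0.1], [1.0, 0.1],   # cluster whose true class is 0
    [0.0, 1.0], [0.1, 0.9], [0.1, 1.0],   # cluster whose true class is 1
])
candidate_mask = np.array([
    [1, 1, 0], [1, 0, 1], [1, 0, 0],
    [0, 1, 1], [1, 1, 0], [0, 1, 0],
], dtype=bool)

pseudo = knn_pseudo_labels(features, candidate_mask, k=2)
targets = smooth_labels(pseudo, num_classes=3, rate=0.1)
print(pseudo)
print(targets)
```

In a full pipeline, the smoothed `targets` would serve as training targets for the classifier, and the classifier's features would replace `features` in the next round of pseudo-labelling. How the paper combines features and predictions during refinement is not specified in this summary, so that step is omitted here.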