A Deep Model for Partial Multi-Label Image Classification with Curriculum Based Disambiguation

Feng Sun, Ming-Kun Xie, Sheng-Jun Huang
DOI: https://doi.org/10.1007/s11633-023-1439-3
2024-05-06
Abstract: In this paper, we study the partial multi-label (PML) image classification problem, where each image is annotated with a candidate label set consisting of multiple relevant labels and other noisy labels. Existing PML methods typically design a disambiguation strategy to filter out noisy labels by utilizing prior knowledge with extra assumptions, which unfortunately is unavailable in many real tasks. Furthermore, because the objective function for disambiguation is usually elaborately designed on the whole training set, it can hardly be optimized in a deep model with SGD on mini-batches. In this paper, for the first time, we propose a deep model for PML to enhance the representation and discrimination ability. On the one hand, we propose a novel curriculum-based disambiguation strategy to progressively identify ground-truth labels by incorporating the varied difficulties of different classes. On the other hand, a consistency regularization is introduced for model retraining to balance fitting identified easy labels and exploiting potential relevant labels. Extensive experimental results on commonly used benchmark datasets show that the proposed method significantly outperforms state-of-the-art (SOTA) methods.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper studies partial multi-label (PML) image classification, where each image comes with a candidate label set that contains several relevant labels mixed with noisy ones. Existing PML methods usually filter out the noisy labels with a disambiguation strategy, but such strategies rely on prior knowledge and extra assumptions that are unavailable in many practical tasks. Moreover, because the disambiguation objective is typically defined over the entire training set, it is hard to optimize in a deep model trained with SGD on mini-batches. To address these issues, the paper proposes a deep model for PML that aims to improve representation and discrimination ability. The model adopts a curriculum-based disambiguation strategy that progressively identifies ground-truth labels while accounting for the varying difficulty of different classes. In addition, consistency regularization is introduced during retraining to balance fitting the identified easy labels against exploiting potentially relevant labels. Experimental results show that the proposed method outperforms current state-of-the-art methods on commonly used benchmark datasets.

The key points of the paper are:
1. A curriculum-based disambiguation strategy that progressively identifies the ground-truth labels of images.
2. Consistency regularization that balances fitting the identified easy labels against exploiting potentially relevant labels.
3. Improved performance on large-scale PMLC datasets without relying on prior knowledge or auxiliary information.

The proposed CDCR (Curriculum based Disambiguation with Consistency Regularization) method thus achieves better performance than existing methods on benchmark datasets such as VOC and MS-COCO, demonstrating its advantages for image classification with noisy candidate labels.
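
The two components described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`select_easy_labels`, `consistency_loss`), the per-class `keep_fraction` schedule, and the squared-difference consistency term are all assumptions made for illustration, since this summary does not specify those details.

```python
import numpy as np


def select_easy_labels(probs, candidates, keep_fraction):
    """Hypothetical curriculum step (an assumption, not the paper's exact rule).

    For each class separately, accept the most confident `keep_fraction`
    of its candidate labels as "easy" labels. Ranking within each class
    means a class with generally low scores still contributes labels.

    probs:       (n_images, n_classes) predicted probabilities
    candidates:  (n_images, n_classes) 0/1 candidate label matrix
    """
    n_images, n_classes = probs.shape
    selected = np.zeros((n_images, n_classes), dtype=bool)
    for k in range(n_classes):
        idx = np.flatnonzero(candidates[:, k])  # images whose candidate set holds class k
        if idx.size == 0:
            continue
        n_keep = int(np.ceil(keep_fraction * idx.size))
        order = np.argsort(-probs[idx, k])  # most confident first
        selected[idx[order[:n_keep]], k] = True
    return selected


def consistency_loss(probs_a, probs_b, mask):
    """Assumed consistency term: mean squared difference between the
    predictions for two augmented views, restricted to entries in `mask`
    (here, candidate labels that have not yet been selected)."""
    mask = mask.astype(float)
    diff = (probs_a - probs_b) ** 2
    return float((diff * mask).sum() / max(mask.sum(), 1.0))


# Toy data: 4 images, 2 classes.
probs = np.array([[0.9, 0.2], [0.6, 0.8], [0.1, 0.7], [0.8, 0.4]])
candidates = np.array([[1, 1], [1, 0], [0, 1], [1, 1]])

easy = select_easy_labels(probs, candidates, keep_fraction=0.5)
remaining = candidates.astype(bool) & ~easy  # candidates still undecided
loss = consistency_loss(probs, probs + 0.1, remaining)
```

In a curriculum of this kind, `keep_fraction` would be raised over training rounds so that harder labels are admitted later, while the consistency term keeps the undecided candidate labels in play instead of discarding them.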