SemiNLL: A Framework of Noisy-Label Learning by Semi-Supervised Learning

Zhuowei Wang, Jing Jiang, Bo Han, Lei Feng, Bo An, Gang Niu, Guodong Long

DOI: https://doi.org/10.48550/arXiv.2012.00925

2020-12-02

Abstract: Deep learning with noisy labels is a challenging task. Recent prominent methods that build on a specific sample selection (SS) strategy and a specific semi-supervised learning (SSL) model achieved state-of-the-art performance. Intuitively, better performance could be achieved if stronger SS strategies and SSL models are employed. Following this intuition, one might easily derive various effective noisy-label learning methods using different combinations of SS strategies and SSL models, which is, however, reinventing the wheel in essence. To prevent this problem, we propose SemiNLL, a versatile framework that combines SS strategies and SSL models in an end-to-end manner. Our framework can absorb various SS strategies and SSL backbones, utilizing their power to achieve promising performance. We also instantiate our framework with different combinations, which set the new state of the art on benchmark-simulated and real-world datasets with noisy labels.

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper addresses the training of deep neural networks (DNNs) when some training labels are wrong. Specifically, it focuses on using semi-supervised learning (SSL) to mitigate the negative impact of label noise on DNN training. Label noise means that some samples in the training set carry wrong or inaccurate labels. It is very common in large-scale datasets, especially those collected through online search engines or crowdsourcing, and it can seriously degrade model performance, which makes handling it an important research topic.

The paper proposes a new framework, SemiNLL, that combines sample selection (SS) strategies with SSL models so that all samples, including those with potentially inaccurate labels, are used more efficiently. The core idea is to transform the noisy-label problem into a semi-supervised learning problem: an SS strategy identifies the "clean" samples, which are kept as labeled data; the remaining samples are treated as unlabeled data; and an SSL technique then trains the model on both sets. This makes use of the information in every sample while reducing the negative impact of label noise.

The framework is flexible and can incorporate various SS strategies and SSL models. The paper also proposes two instantiations, DivideMix+ and GPL, which are built from different combinations of SS strategy and SSL model and which demonstrate superior performance on multiple benchmark datasets.
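The labeled/unlabeled split described above can be illustrated with a minimal sketch. It assumes a "small-loss" selection rule (keep the fraction of samples with the lowest per-sample loss as clean), which is one commonly used SS strategy and not necessarily the one used by DivideMix+ or GPL. The function names, the `keep_ratio` parameter, and the use of `None` to mark an unlabeled sample are hypothetical choices made for illustration only.

```python
from typing import List, Optional, Tuple


def split_by_small_loss(
    losses: List[float], keep_ratio: float = 0.5
) -> Tuple[List[int], List[int]]:
    """Split sample indices into (labeled, unlabeled) by per-sample loss.

    Assumption: samples with the smallest loss are treated as "clean".
    This stands in for whichever SS strategy is plugged into the framework.
    """
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in [0, 1]")
    n_keep = int(round(keep_ratio * len(losses)))
    # Indices ordered from lowest to highest loss.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    labeled = sorted(order[:n_keep])
    unlabeled = sorted(order[n_keep:])
    return labeled, unlabeled


def to_ssl_targets(
    noisy_labels: List[int], labeled_idx: List[int]
) -> List[Optional[int]]:
    """Keep labels of selected samples; discard the rest (None = unlabeled)."""
    keep = set(labeled_idx)
    return [y if i in keep else None for i, y in enumerate(noisy_labels)]


if __name__ == "__main__":
    # Toy per-sample losses from some current model (made-up numbers).
    losses = [0.1, 2.3, 0.2, 1.9]
    noisy_labels = [0, 1, 1, 0]

    labeled_idx, unlabeled_idx = split_by_small_loss(losses, keep_ratio=0.5)
    targets = to_ssl_targets(noisy_labels, labeled_idx)

    print(labeled_idx)    # indices kept as labeled data
    print(unlabeled_idx)  # indices handed to the SSL model as unlabeled data
    print(targets)
```

The resulting `targets` list is what an SSL backbone would consume: labeled samples contribute a supervised loss, while samples marked `None` contribute only through the unsupervised part of the SSL objective. In a training loop the split would typically be recomputed as the model improves.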
