Noisy Label Classification using Label Noise Selection with Test-Time Augmentation Cross-Entropy and NoiseMix Learning

Hansang Lee, Haeil Lee, Helen Hong, Junmo Kim
DOI: https://doi.org/10.1007/978-3-031-17027-0_8
2024-07-17
Abstract: As the size of the dataset used in deep learning tasks increases, the noisy label problem, which is a task of making deep learning robust to the incorrectly labeled data, has become an important task. In this paper, we propose a method of learning noisy label data using the label noise selection with test-time augmentation (TTA) cross-entropy and classifier learning with the NoiseMix method. In the label noise selection, we propose TTA cross-entropy by measuring the cross-entropy to predict the test-time augmented training data. In the classifier learning, we propose the NoiseMix method based on MixUp and BalancedMix methods by mixing the samples from the noisy and the clean label data. In experiments on the ISIC-18 public skin lesion diagnosis dataset, the proposed TTA cross-entropy outperformed the conventional cross-entropy and the TTA uncertainty in detecting label noise data in the label noise selection process. Moreover, the proposed NoiseMix not only outperformed the state-of-the-art methods in the classification performance but also showed the most robustness to the label noise in the classifier learning.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the problem of noisy labels in deep learning. Large-scale training datasets often contain a proportion of incorrectly labeled data (noisy labels), which can severely degrade a model's learning and its ability to generalize. To address this, the authors propose a two-stage method:

1. **Label Noise Selection**: A technique called "Test-Time Augmentation Cross-Entropy" (TTA Cross-Entropy) is used to identify and separate noisy label data. This stage includes the following steps:
   - **Warm-up**: Train a weak classifier to predict the labels of the training data.
   - **Test-Time Augmentation and Weak Classifier Prediction**: Augment the training data and use the weak classifier obtained in the warm-up stage to make predictions on the augmented samples.
   - **TTA Cross-Entropy Calculation**: Based on the weak classifier's predictions on the augmented data, calculate the TTA cross-entropy to assess the correctness of the labels.
2. **Classifier Training**: A technique called "NoiseMix" is used to train the classifier. NoiseMix builds on the MixUp and BalancedMix methods by mixing samples from the noisy label data with samples from the clean label data, thereby improving the model's robustness to noisy labels.

In the experimental section, the authors validate the proposed method on the ISIC-18 skin lesion diagnosis dataset. The results show that the proposed TTA Cross-Entropy outperforms conventional cross-entropy and TTA uncertainty in detecting noisy labels, and that the proposed NoiseMix not only outperforms state-of-the-art methods in classification performance but also shows the most robustness to label noise.
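The label noise selection stage can be illustrated with a minimal NumPy sketch. The summary does not give the exact formula, so this sketch assumes that the TTA cross-entropy of a sample is the cross-entropy between its given label and the weak classifier's softmax output, averaged over the augmented views. The function names `tta_cross_entropy` and `split_by_threshold`, the `(K, N, C)` input layout, and the fixed threshold rule are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def tta_cross_entropy(probs, labels, eps=1e-12):
    """Per-sample cross-entropy averaged over augmented views (assumed definition).

    probs:  array of shape (K, N, C), softmax outputs of the weak classifier
            for K augmented views of N training samples over C classes.
    labels: array of shape (N,), the given (possibly noisy) integer labels.
    Returns an array of shape (N,); larger values suggest a suspicious label.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_samples = probs.shape[1]
    # Probability assigned to the given label in each view, shape (K, N).
    p_label = probs[:, np.arange(n_samples), labels]
    # Clip to avoid log(0), then average the negative log-likelihood over views.
    return -np.log(np.clip(p_label, eps, 1.0)).mean(axis=0)


def split_by_threshold(scores, threshold):
    """Hypothetical selection rule: scores above the threshold count as noisy.

    Returns two boolean masks (clean_mask, noisy_mask) of the same shape as scores.
    """
    scores = np.asarray(scores, dtype=float)
    noisy_mask = scores > threshold
    return ~noisy_mask, noisy_mask
```

In practice, `probs` would come from running the warm-up classifier on K augmented copies of each training image. How the threshold is chosen is not stated in this summary, so it is left as a free parameter here.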
In summary, this paper aims to improve the handling of noisy label data by introducing TTA Cross-Entropy and NoiseMix techniques, thereby enhancing the learning efficiency and performance of deep learning models on datasets with noisy labels.
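The idea behind NoiseMix, mixing samples from the clean and the noisy label sets, can be sketched in the style of MixUp. The summary does not specify the mixing distribution or how the noisy samples' labels are treated, so this sketch assumes a per-sample coefficient drawn from Beta(alpha, alpha) and uses the noisy samples' given labels as-is. The function name `noisemix_batch` and all of its parameters are hypothetical and should not be read as the authors' implementation.

```python
import numpy as np


def noisemix_batch(x_clean, y_clean, x_noisy, y_noisy, num_classes,
                   alpha=1.0, rng=None):
    """MixUp-style mixing of each clean sample with a random noisy sample.

    x_clean: (N, ...) inputs selected as clean; y_clean: (N,) integer labels.
    x_noisy: (M, ...) inputs selected as noisy; y_noisy: (M,) integer labels.
    Returns mixed inputs of shape (N, ...) and soft labels of shape (N, num_classes).
    """
    rng = np.random.default_rng() if rng is None else rng
    x_clean = np.asarray(x_clean, dtype=float)
    x_noisy = np.asarray(x_noisy, dtype=float)
    y_clean = np.asarray(y_clean, dtype=int)
    y_noisy = np.asarray(y_noisy, dtype=int)

    n = x_clean.shape[0]
    # Pair every clean sample with a randomly drawn noisy sample.
    idx = rng.integers(0, x_noisy.shape[0], size=n)
    # Assumed mixing distribution: lambda ~ Beta(alpha, alpha), one per pair.
    lam = rng.beta(alpha, alpha, size=n)

    # Reshape lambda so it broadcasts over the non-batch input dimensions.
    lam_x = lam.reshape((n,) + (1,) * (x_clean.ndim - 1))
    x_mix = lam_x * x_clean + (1.0 - lam_x) * x_noisy[idx]

    # Mix the one-hot labels with the same coefficients.
    eye = np.eye(num_classes)
    lam_y = lam[:, None]
    y_mix = lam_y * eye[y_clean] + (1.0 - lam_y) * eye[y_noisy[idx]]
    return x_mix, y_mix
```

The soft labels returned here would typically be used with a cross-entropy loss that accepts probability targets. Whether the original method weights the clean side more heavily, as BalancedMix does for its two sampling schemes, is not stated in this summary.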