Abstract: In classification tasks, label noise has a significant impact on model performance, primarily by disrupting prediction consistency and thereby reducing classification accuracy. This work introduces a novel prediction consistency regularization that mitigates the impact of label noise on neural networks by imposing constraints on the prediction consistency of similar samples. However, determining which samples should be considered similar is a primary challenge. We formalize similar-sample identification as a clustering problem and employ twin contrastive clustering (TCC) to address it. To ensure similarity between samples within each cluster, we enhance TCC by adjusting the clustering prior distribution using label information. Based on the adjusted TCC's clustering results, we first construct a prototype for each cluster and then formulate a prototype-based regularization term that enhances prediction consistency with respect to the prototype within each cluster and counteracts the adverse effects of label noise. We conducted comprehensive experiments on benchmark datasets to evaluate the effectiveness of our method under various scenarios with different noise rates. The results demonstrate an improvement in classification accuracy. Subsequent analytical experiments confirm that the proposed regularization term effectively mitigates noise and that the adjusted TCC enhances the quality of similar-sample recognition.

What problem does this paper attempt to address?

The paper mainly addresses the issue of the impact of label noise on model performance in machine learning classification tasks. Specifically, the paper proposes a new method called Prediction Consistency Regularization (PCR) to mitigate the negative effects of noisy labeled datasets during the neural network training process.

The paper points out that in classification tasks, label noise significantly affects model performance, primarily by disrupting the prediction consistency among similar samples, thereby reducing classification accuracy. To alleviate this problem, the authors' proposed method is divided into two parts:

1. **Improvement of Twin Contrastive Clustering (TCC)**:
   - TCC is used to identify which samples in the dataset are similar. TCC is a contrastive learning framework that can generate representations and cluster based on these representations.
   - The paper improves TCC to better utilize label information, i.e., considering label consistency during the clustering process to improve clustering quality. This includes constructing an "alignment matrix" to reflect the relationship between categories and clusters, and introducing a confidence threshold to filter out label information that might be misleading due to noise.
2. **Prototype-based Regularization based on Clustering Results**:
   - Based on the improved TCC clustering results, the paper proposes a prototype regularization term. This regularization term aims to enhance the prediction consistency within clusters by penalizing the difference between the prediction distribution of samples within the same cluster and the cluster prototype.
   - The cluster prototype is obtained by the weighted average of the prediction distributions of samples within each cluster, where the weights are determined by the confidence of the samples belonging to the cluster.

Experimental results show that the proposed regularization method effectively improves classification accuracy under different noise rates. Additionally, subsequent analysis experiments verified that the proposed regularization term can indeed effectively mitigate the impact of noise, and the improved TCC enhances the quality of similar sample identification.
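
The prototype construction and consistency penalty described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `cluster_prototypes` and `prototype_consistency_loss` are hypothetical, the soft assignment matrix `assign` stands in for the cluster-membership confidences produced by the adjusted TCC, and KL divergence is assumed as the measure of "difference", since the summary does not name one.

```python
import numpy as np


def cluster_prototypes(probs: np.ndarray, assign: np.ndarray) -> np.ndarray:
    """Confidence-weighted average of class predictions per cluster.

    probs:  (N, C) predicted class distributions, one row per sample.
    assign: (N, K) non-negative cluster-membership confidences (assumed input).
    Returns a (K, C) array with one prototype distribution per cluster.
    """
    # Normalise each cluster's confidences so its weights sum to 1.
    totals = np.clip(assign.sum(axis=0, keepdims=True), 1e-12, None)
    weights = assign / totals
    return weights.T @ probs


def prototype_consistency_loss(probs: np.ndarray, assign: np.ndarray,
                               eps: float = 1e-12) -> float:
    """Mean KL(prototype || prediction) over all samples.

    Each sample is compared with the prototype of its most confident cluster.
    The choice of KL divergence and its direction are assumptions made for
    this sketch only.
    """
    protos = cluster_prototypes(probs, assign)
    target = protos[assign.argmax(axis=1)]  # (N, C) prototype for each sample
    kl = np.sum(target * (np.log(target + eps) - np.log(probs + eps)), axis=1)
    return float(kl.mean())


# Two clusters whose members agree perfectly: the penalty vanishes.
consistent = np.array([[0.9, 0.1], [0.9, 0.1], [0.2, 0.8], [0.2, 0.8]])
membership = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
loss_consistent = prototype_consistency_loss(consistent, membership)

# One cluster whose two members disagree: the penalty is positive.
conflicting = np.array([[0.9, 0.1], [0.1, 0.9]])
same_cluster = np.array([[1.0, 0.0], [1.0, 0.0]])
loss_conflicting = prototype_consistency_loss(conflicting, same_cluster)
```

In a training loop, a term of this kind would be added to the classification loss with a weighting coefficient, so that samples placed in the same cluster are pushed toward a shared prediction even when some of their labels are noisy.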

Prediction Consistency Regularization for Learning with Noise Labels Based on Contrastive Clustering
