Abstract: Deep learning with noisy labels is challenging and, in many circumstances, unavoidable. Existing methods reduce the impact of mislabeled samples by down-weighting their losses or screening them out, both of which rely heavily on the model's discriminative power to identify mislabeled samples. However, during training the trainee model is imperfect and will misjudge some mislabeled samples, which causes continuous damage to training. Consequently, there is a large performance gap between existing anti-noise models trained with noisy samples and models trained with clean samples. In this paper, we put forward a Gradient Switching Strategy (GSS) to prevent the continuous damage that mislabeled samples inflict on the classifier. Theoretical analysis shows that the damage comes from the misleading gradient direction computed from the mislabeled samples: under the influence of their accumulated misleading gradients, the trainee model deviates from the correct optimization direction. To address this problem, the proposed GSS alleviates the damage by switching the gradient direction of each sample based on a gradient direction pool, which contains the gradient directions of all classes, each with its own probability. During training, each gradient direction pool is updated iteratively, assigning higher probabilities to the potential principal directions of high-confidence samples. Conversely, uncertain samples are forced to explore different directions rather than mislead the model in a fixed direction. Extensive experiments show that GSS can achieve performance comparable to that of a model trained with clean data. Moreover, the proposed GSS can be plugged into existing frameworks. This idea of switching gradient directions provides a new perspective for future noisy-label learning.
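The abstract does not give the update rule or the sampling scheme, so the following is only a minimal NumPy sketch of one way the gradient direction pool could work. Everything in it is an assumption made for illustration: the function names (`init_pool`, `update_pool`, `switched_logit_gradient`), the initial weight placed on the given label, and the moving-average update toward the model's predictions are hypothetical choices, not the paper's actual algorithm. The sketch relies on the standard fact that the cross-entropy gradient with respect to the logits is `softmax(logits) - onehot(target)`, so choosing a target class amounts to choosing a gradient direction.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def init_pool(noisy_labels, num_classes, label_weight=0.9):
    # Assumption: each sample's pool starts with most probability mass on
    # its given (possibly wrong) label and the rest spread evenly.
    n = len(noisy_labels)
    pool = np.full((n, num_classes), (1.0 - label_weight) / (num_classes - 1))
    pool[np.arange(n), noisy_labels] = label_weight
    return pool

def update_pool(pool, probs, momentum=0.9):
    # Assumption: the pool follows the model's predictions via an
    # exponential moving average. Confident predictions concentrate the
    # pool on one class; uncertain ones keep it spread out.
    new_pool = momentum * pool + (1.0 - momentum) * probs
    return new_pool / new_pool.sum(axis=-1, keepdims=True)

def switched_logit_gradient(probs, pool, rng):
    # Sample one target class per sample from its pool, then return the
    # cross-entropy gradient w.r.t. the logits for that sampled target.
    n, k = probs.shape
    targets = np.array([rng.choice(k, p=pool[i]) for i in range(n)])
    onehot = np.eye(k)[targets]
    return probs - onehot, targets
```

A hypothetical training step would call `update_pool` with the current softmax outputs and then backpropagate the gradient returned by `switched_logit_gradient` in place of the gradient from the fixed noisy label. Under these assumptions, a sample whose pool is spread across classes receives a different direction from step to step, so its influence does not accumulate along a single, possibly misleading, direction.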

How to Prevent the Continuous Damage of Noises to Model Training?