SMCL: Saliency Masked Contrastive Learning for Long-tailed Recognition

Sanglee Park, Seung-won Hwang, Jungmin So

DOI: https://doi.org/10.1109/ICASSP49357.2023.10097143

2024-06-04

Abstract: Real-world data often follow a long-tailed distribution, with a high imbalance in the number of samples between classes. The problem with training on imbalanced data is that some background features, common to all classes, can go unobserved in classes with scarce samples. As a result, these background features become correlated with the "major" classes and bias predictions toward them. In this paper, we propose saliency masked contrastive learning, a new method that uses saliency masking and contrastive learning to mitigate the problem and improve the generalizability of a model. Our key idea is to mask the important part of an image using saliency detection and use contrastive learning to move the masked image towards minor classes in the feature space, so that background features present in the masked image are no longer correlated with the original class. Experimental results show that our method achieves state-of-the-art level performance on benchmark long-tailed datasets.

Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

The paper aims to address the problem of visual recognition in long-tailed distribution datasets. In real-world datasets, the number of samples in each category varies greatly, forming a long-tailed distribution. This imbalance makes it difficult for the model to correctly learn distinguishing features, especially in minority classes (categories with fewer samples). Specifically, background features should be common across different categories, but during training, due to the lack of minority class samples, these background features may be incorrectly associated with certain majority classes (categories with more samples), leading the model to bias towards these majority classes during prediction.

To solve this problem, the paper proposes a new method called **Saliency Masked Contrastive Learning (SMCL)**. This method is implemented through the following steps:

1. **Saliency Masking**: Use saliency detection techniques to mask out important parts of the image, thereby retaining the background parts.
2. **Weighted Sampling**: Prioritize minority classes when selecting target labels to increase the selection probability of minority class samples.
3. **Contrastive Learning**: Use contrastive learning to align the features of masked images towards minority classes, achieving the sharing of background features in the feature space, so they no longer bias towards majority classes.

Experimental results show that SMCL achieves state-of-the-art performance on multiple long-tailed benchmark datasets, with significant improvements in the performance of minority classes.
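The three steps described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the summary does not specify the saliency detector, the sampling weights, or the exact contrastive loss. The sketch therefore assumes a precomputed saliency map, inverse-frequency sampling weights, and a simplified prototype-based loss (cross-entropy over cosine similarities to per-class prototype vectors) standing in for the contrastive objective. All function and parameter names are hypothetical.

```python
import numpy as np


def sample_target_labels(class_counts, batch_size, rng):
    """Weighted sampling: draw target labels so that rare classes are favored.

    Assumption: probabilities are proportional to 1 / class frequency.
    """
    counts = np.asarray(class_counts, dtype=np.float64)
    weights = 1.0 / counts
    probs = weights / weights.sum()
    labels = rng.choice(len(counts), size=batch_size, p=probs)
    return labels, probs


def mask_salient_region(image, saliency, keep_fraction=0.5, fill_value=0.0):
    """Saliency masking: blank out the most salient pixels, keep the background.

    image:    array of shape (H, W, C)
    saliency: array of shape (H, W), assumed to come from any saliency detector
    keep_fraction: fraction of least-salient pixels left untouched (assumed knob)
    """
    threshold = np.quantile(saliency, keep_fraction)
    salient = saliency > threshold
    masked = image.copy()
    masked[salient] = fill_value
    return masked, salient


def masked_target_loss(masked_feats, target_labels, class_prototypes, temperature=0.1):
    """Simplified stand-in for the contrastive term.

    Pulls each masked-image feature toward the prototype of its sampled
    target class and away from the other prototypes.
    """
    f = masked_feats / np.linalg.norm(masked_feats, axis=1, keepdims=True)
    p = class_prototypes / np.linalg.norm(class_prototypes, axis=1, keepdims=True)
    logits = f @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    rows = np.arange(len(target_labels))
    return -log_probs[rows, target_labels].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy long-tailed class counts: one head class, one medium, one tail class.
    targets, probs = sample_target_labels([1000, 100, 10], batch_size=8, rng=rng)
    image = rng.random((4, 4, 3))
    saliency = rng.random((4, 4))
    masked, salient = mask_salient_region(image, saliency)
    feats = rng.normal(size=(8, 16))        # placeholder encoder outputs
    prototypes = rng.normal(size=(3, 16))   # placeholder class prototypes
    print(probs, salient.sum(), masked_target_loss(feats, targets, prototypes))
```

In this sketch the masked image is paired with a target label drawn mostly from tail classes, so minimizing the loss moves background-only features away from the original (often majority) class, which is the effect the summary attributes to SMCL.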

Contrastive Learning with Hallucinating Data for Long-Tailed Face Recognition

Decoupled Contrastive Learning for Long-Tailed Recognition

Balanced Contrastive Learning for Long-Tailed Visual Recognition

Targeted Supervised Contrastive Learning for Long-Tailed Recognition

Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition

Align Yourself: Self-supervised Pre-training for Fine-grained Recognition via Saliency Alignment

Long-Tailed Recognition by Mutual Information Maximization between Latent Features and Ground-Truth Labels

Bt-Vmf Contrastive and Collaborative Learning for Long-Tailed Visual Recognition

Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning

Saliency Guided Contrastive Learning on Scene Images

Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where

Self-Damaging Contrastive Learning

Long-Tail Learning with Rebalanced Contrastive Loss

Masked Contrastive Learning for Anomaly Detection

ECL: Class-Enhancement Contrastive Learning for Long-tailed Skin Lesion Classification

ContrastMask: Contrastive Learning to Segment Every Thing

MHCCL: Masked Hierarchical Cluster-Wise Contrastive Learning for Multivariate Time Series

SiSL-Net: Saliency-guided self-supervised learning network for image classification

Probabilistic Contrastive Learning for Long-Tailed Visual Recognition

Temporal-masked skeleton-based action recognition with supervised contrastive learning