Self-Damaging Contrastive Learning

Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang
DOI: https://doi.org/10.48550/arXiv.2106.02990
2021-06-06
Abstract: The recent breakthrough achieved by contrastive learning accelerates the pace for deploying unsupervised training on real-world data applications. However, unlabeled data in reality is commonly imbalanced and shows a long-tail distribution, and it is unclear how robustly the latest contrastive learning methods could perform in the practical scenario. This paper proposes to explicitly tackle this challenge, via a principled framework called Self-Damaging Contrastive Learning (SDCLR), to automatically balance the representation learning without knowing the classes. Our main inspiration is drawn from the recent finding that deep models have difficult-to-memorize samples, and those may be exposed through network pruning. It is further natural to hypothesize that long-tail samples are also tougher for the model to learn well due to insufficient examples. Hence, the key innovation in SDCLR is to create a dynamic self-competitor model to contrast with the target model, which is a pruned version of the latter. During training, contrasting the two models will lead to adaptive online mining of the most easily forgotten samples for the current target model, and implicitly emphasize them more in the contrastive loss. Extensive experiments across multiple datasets and imbalance settings show that SDCLR significantly improves not only overall accuracies but also balancedness, in terms of linear evaluation on the full-shot and few-shot settings. Our code is available at: https://github.com/VITA-Group/SDCLR.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper addresses the performance of contrastive learning on data with a long-tailed distribution. Although contrastive learning has made remarkable progress on many tasks, the paper points out that existing contrastive methods remain fragile on real-world data, which is typically imbalanced and long-tailed. This fragility shows up in three ways:

1. **Data imbalance**: Real-world data usually follows a long-tailed distribution, in which a few categories have far more samples than the rest. As a result, the model tends to memorize frequently occurring samples and neglect samples from minority categories.
2. **Imbalanced feature representations**: Because of the data imbalance, the model learns stronger feature representations for the majority categories and weaker ones for the minority categories. This in turn hurts downstream performance, especially in linear evaluation and few-shot learning.
3. **Weak generalization**: The model generalizes poorly on long-tailed data. It easily "forgets" rare or atypical instances, which degrades performance.

### Solutions

To address these problems, the paper proposes a framework named **Self-Damaging Contrastive Learning (SDCLR)**. Its main innovations are:

- **Dynamic self-competitor model**: A self-competitor is created by pruning the target model online. The pruned model is contrasted with the target model, which automatically exposes and emphasizes the samples that are most easily "forgotten" (hypothesized to be mostly long-tail samples).
- **Adaptive online mining**: During training, SDCLR adaptively mines the most easily forgotten samples online and implicitly gives them more weight in the contrastive loss, thereby rebalancing it.
- **No class information required**: SDCLR does not need class labels, so it applies to unsupervised representation learning.

### Experimental results

The paper evaluates SDCLR through extensive experiments across multiple datasets and imbalance settings. The results show that SDCLR significantly improves both overall accuracy and balancedness under linear evaluation, in the full-shot and few-shot settings alike.

### Summary

With the SDCLR framework, the paper tackles the fragility of contrastive learning on long-tailed data and offers a new approach to unsupervised representation learning that does not rely on class labels.
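The self-competitor idea described under Solutions can be illustrated with a small, dependency-free sketch. This is not the authors' implementation (see their repository for that): the linear encoder, the one-shot magnitude pruning, the one-directional InfoNCE-style loss, and the names `magnitude_prune`, `encode`, and `info_nce` are all assumptions chosen for illustration. It also feeds the same input to both branches, whereas contrastive training would normally use two augmented views. The point is only to show how a dense branch and its pruned copy yield a per-sample loss, which is larger for samples whose features the pruning disturbs most.

```python
import math


def magnitude_prune(weights, ratio):
    """Return a copy of `weights` (a list of rows) with roughly the
    smallest-magnitude fraction `ratio` of entries set to zero.
    Ties at the threshold are all pruned, so slightly more may be removed."""
    flat = sorted(abs(w) for row in weights for w in row)
    k = int(len(flat) * ratio)  # number of entries to remove
    if k == 0:
        return [row[:] for row in weights]
    threshold = flat[k - 1]  # largest magnitude that still gets pruned
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def encode(weights, x):
    """Hypothetical encoder: a linear map followed by L2 normalization."""
    z = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    norm = math.sqrt(sum(v * v for v in z)) or 1.0
    return [v / norm for v in z]


def info_nce(dense_feats, pruned_feats, temperature=0.5):
    """Per-sample contrastive loss. Sample i's dense feature should match
    its own pruned feature (positive) rather than the pruned features of
    the other samples (negatives)."""
    losses = []
    for i, zi in enumerate(dense_feats):
        logits = [
            sum(a * b for a, b in zip(zi, zj)) / temperature
            for zj in pruned_feats
        ]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denominator - logits[i])
    return losses


# Toy target model: 2 output features, 3 input features.
weights = [[0.9, 0.05, -0.4], [0.1, -0.8, 0.02]]
# Self-competitor: the same weights with half of the entries pruned.
pruned_weights = magnitude_prune(weights, ratio=0.5)

samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.2, 0.1, 1.0]]
dense_feats = [encode(weights, x) for x in samples]
pruned_feats = [encode(pruned_weights, x) for x in samples]

# Samples whose features change most under pruning incur a larger loss.
losses = info_nce(dense_feats, pruned_feats)
for x, loss in zip(samples, losses):
    print(x, round(loss, 4))
```

In an actual training loop the pruning mask would be recomputed as the target model's weights change, so the competitor stays "dynamic"; the sketch prunes once to keep the example short.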