Improving Fine-tuning of Self-supervised Models with Contrastive Initialization

Haolin Pan, Yong Guo, Qinyi Deng, Haomin Yang, Jian Chen, Yiqun Chen
DOI: https://doi.org/10.1016/j.neunet.2022.12.012
IF: 7.8
2022-01-01
Neural Networks
Abstract: Self-supervised learning (SSL) has achieved remarkable performance in pre-training models that can then be fine-tuned for downstream tasks. However, these self-supervised models may not capture meaningful semantic information, because images belonging to the same class are often treated as negative pairs in the contrastive loss. Consequently, images of the same class often lie far apart in the learned feature space, which hampers the fine-tuning process. To address this issue, we seek to explicitly enhance the semantic relations among instances on the target downstream task and provide a better initialization for the subsequent fine-tuning. To this end, we propose a Contrastive Initialization (COIN) method that breaks the standard fine-tuning pipeline by introducing an extra class-aware initialization stage before fine-tuning. Specifically, we exploit a supervised contrastive loss to increase the inter-class discrepancy and intra-class compactness of features on the target dataset. In this way, self-supervised models can be easily trained to discriminate instances of different classes during the final fine-tuning stage. Extensive experiments show that, with the enriched semantics, COIN significantly outperforms existing methods without introducing extra training cost and sets a new state of the art on multiple downstream tasks. For example, compared with the baseline method, COIN improves accuracy by 5% on ImageNet-20 and by 2.57% on CIFAR100.
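To make the idea of a class-aware contrastive objective concrete, the following is a minimal, dependency-free sketch of one common form of supervised contrastive loss, in which every other sample with the same label serves as a positive for a given anchor. It is an illustration under assumptions, not the authors' implementation: the function names, the temperature value of 0.1, and the toy 2-D features are all hypothetical, and the exact loss used in COIN may differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Illustrative supervised contrastive loss over a batch (assumed form).

    For each anchor, all other samples with the same label are positives and
    every other sample in the batch appears in the softmax denominator.
    Anchors without any positive are skipped.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarities between the anchor and all others.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        # Average negative log-probability over the anchor's positives.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Toy 2-D features (hypothetical): two tight groups along different axes.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_clustered = supervised_contrastive_loss(toy_features, [0, 0, 1, 1])
loss_scattered = supervised_contrastive_loss(toy_features, [0, 1, 0, 1])
print(loss_clustered < loss_scattered)
```

The loss is small when same-class features already sit close together and large when they are scattered, which is the property an initialization stage of this kind would exploit before ordinary fine-tuning.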