S4: Self-Supervised Learning with Sparse-dense Sampling

Yongqin Tian, Weidong Zhang, Peng Su, Yibo Xu, Peixian Zhuang, Xiwang Xie, Wenyi Zhao
DOI: https://doi.org/10.1016/j.knosys.2024.112040
IF: 8.139
2024-01-01
Knowledge-Based Systems
Abstract: Self-supervised visual representation learning (SSL) attempts to extract significant features from unlabeled datasets, alleviating the need for labor-intensive and time-consuming manual labeling. However, existing contrastive learning-based methods typically underutilize datasets, consume significant computational resources, and require long training schedules or large batch sizes. In this study, we propose a novel method for optimizing self-supervised learning that integrates the advantages of sparse-dense sampling and collaborative optimization, thereby significantly improving the performance of downstream tasks. Specifically, sparse-dense sampling primarily focuses on high-level semantic features, while leveraging the spatial structure relationship provided by the unlabeled dataset to ensure the incorporation of low-level texture features and improve data utilization. In addition, collaborative optimization, which includes contrastive and location tasks, further enhances the model’s ability to perceive features of different dimensions, thereby improving its utilization of features in the embedding space. Furthermore, combining sparse-dense sampling with collaborative optimization can reduce computational consumption while improving performance. Extensive experiments demonstrate that the proposed method effectively reduces computational requirements while delivering favorable results. The code and model weights will be available at https://github.com/AI-TYQ/S4.