Self-supervised visual learning in the low-data regime: a comparative evaluation

Sotirios Konstantakos, Despina Ioanna Chalkiadaki, Ioannis Mademlis, Yuki M. Asano, Efstratios Gavves, Georgios Th. Papadopoulos

2024-04-26

Abstract: Self-Supervised Learning (SSL) is a valuable and robust training methodology for contemporary Deep Neural Networks (DNNs), enabling unsupervised pretraining on a 'pretext task' that does not require ground-truth labels or annotations. This allows efficient representation learning from massive amounts of unlabeled training data, which in turn leads to increased accuracy in a 'downstream task' through supervised transfer learning. Although SSL is relatively straightforward to conceptualize and apply, it is not always feasible to collect or use very large pretraining datasets, especially in real-world application settings. In particular, in specialized and domain-specific application scenarios, it may not be achievable or practical to assemble a relevant image pretraining dataset on the order of millions of instances, or it could be computationally infeasible to pretrain at this scale. This motivates an investigation into the effectiveness of common SSL pretext tasks when the pretraining dataset is of relatively limited size. In this context, this work introduces a taxonomy of modern visual SSL methods, accompanied by detailed explanations and insights regarding the main categories of approaches, and subsequently conducts a thorough comparative experimental evaluation in the low-data regime, aiming to identify: a) what is learnt via low-data SSL pretraining, and b) how different SSL categories behave in such training scenarios. Interestingly, for domain-specific downstream tasks, in-domain low-data SSL pretraining outperforms the common approach of large-scale pretraining on general datasets. Based on the obtained results, valuable insights are highlighted regarding the performance of each category of SSL methods, which in turn suggest straightforward future research directions in the field.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

This paper evaluates how effective self-supervised learning (SSL) methods are when the pretraining data is limited. Specifically, it examines how different SSL pretext tasks perform in visual representation learning when the pretraining dataset is relatively small (for example, 50,000 to 300,000 images). The motivation is that in many real-world application fields (such as medical imaging) it is difficult to collect large-scale datasets, even when the data does not need to be manually labeled. Moreover, even if a large amount of data can be collected, limited computing resources may make pretraining at that scale infeasible.

The main objectives of the paper are:

1. **Explore what can be learned from low-data SSL pretraining**: by pretraining with SSL on small-scale datasets, study which useful features or representations the models learn.
2. **Compare how different SSL methods behave in low-data scenarios**: compare different categories of SSL methods (such as contrastive learning, generative learning, clustering, and self-distillation) under low-data conditions to understand which are better suited to this setting.

Through these studies, the paper aims to provide insights for researchers working in specialized fields (such as X-ray image analysis), where large amounts of data, even unlabeled data, are usually hard to obtain. The findings may help improve the performance of existing models and may also guide future research directions.
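As an illustration of the low-data setting described above, the following is a minimal sketch of how fixed-size pretraining subsets could be drawn from a larger pool of unlabeled images. It is not the authors' procedure: the function name `low_data_subsets`, the file-name pool, the intermediate 100,000 size, and the choice to make the subsets nested are all assumptions made for this example.

```python
import random


def low_data_subsets(pool, sizes, seed=0):
    """Draw reproducible random subsets of an unlabeled pool.

    Hypothetical helper: the subsets are prefixes of a single seeded
    shuffle, so each smaller subset is contained in every larger one.
    """
    if max(sizes) > len(pool):
        raise ValueError("largest subset exceeds pool size")
    rng = random.Random(seed)      # local RNG, leaves global state untouched
    shuffled = list(pool)          # copy so the caller's pool is not reordered
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sorted(sizes)}


# Hypothetical pool of unlabeled image file names (no labels are needed).
pool = [f"img_{i:07d}.png" for i in range(400_000)]

# Subset sizes spanning the 50,000 to 300,000 range mentioned above.
subsets = low_data_subsets(pool, sizes=[50_000, 100_000, 300_000])
```

Nesting the subsets is one possible design choice: it keeps comparisons across dataset sizes from being confounded by entirely different image samples.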

