Abstract: Transfer learning makes it possible to reuse knowledge learned on a source task to help learn a target task. A simple form of transfer learning is common in current state-of-the-art computer vision models: pre-training a model for image classification on the ILSVRC dataset and then fine-tuning it on a target task. However, previous systematic studies of transfer learning have been limited, and the circumstances in which it is expected to work are not fully understood. In this paper we carry out an extensive experimental exploration of transfer learning across vastly different image domains (consumer photos, autonomous driving, aerial imagery, underwater, indoor scenes, synthetic, close-ups) and task types (semantic segmentation, object detection, depth estimation, keypoint detection). Importantly, these are all complex, structured-output task types relevant to modern computer vision applications. In total we carry out over 2000 transfer learning experiments, including many where the source and target come from different image domains, task types, or both. We systematically analyze these experiments to understand the impact of image domain, task type, and dataset size on transfer learning performance. Our study leads to several insights and concrete recommendations: (1) for most tasks there exists a source which significantly outperforms ILSVRC'12 pre-training; (2) the image domain is the most important factor for achieving positive transfer; (3) the source dataset should include the image domain of the target dataset to achieve the best results; (4) at the same time, we observe only small negative effects when the image domain of the source task is much broader than that of the target; (5) transfer across task types can be beneficial, but its success depends heavily on both the source and target task types.

Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types