Abstract: Data scarcity is a major limiting factor for applying modern machine learning techniques to clinical tasks. Although sufficient data exists for some well-studied medical tasks, there remains a long tail of clinically relevant tasks with poor data availability. Recently, numerous foundation models have demonstrated high suitability for few-shot learning (FSL) and zero-shot learning (ZSL), potentially making them more accessible to practitioners. However, it remains unclear which foundation model performs best on FSL medical image analysis tasks and what the optimal methods are for learning from limited data. We conducted a comprehensive benchmark study of ZSL and FSL using 16 pretrained foundation models on 19 diverse medical imaging datasets. Our results indicate that BiomedCLIP, a model pretrained exclusively on medical data, performs best on average for very small training set sizes, while very large CLIP models pretrained on LAION-2B perform best with slightly more training samples. However, simply fine-tuning a ResNet-18 pretrained on ImageNet performs similarly with more than five training examples per class. Our findings also highlight the need for further research on foundation models specifically tailored for medical applications and the collection of more datasets to train these models.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is how to effectively use pre-trained foundation models for few-shot learning (FSL) and zero-shot learning (ZSL) in the case of scarce medical imaging data. Specifically, the paper aims to explore which foundation models perform best when data is very limited by comparing the FSL and ZSL performance of different pre-trained models on various medical imaging tasks, and to study the optimal learning strategies.

### Background and Motivation

1. **Data Scarcity Problem**: The application of modern machine-learning techniques in clinical tasks is severely restricted by data scarcity. Although there is sufficient data for some common medical tasks, many clinically relevant tasks lack sufficient data due to difficulties in data collection.
2. **Application Potential of Foundation Models**: In recent years, many foundation models have shown high applicability in FSL and ZSL tasks, which may make these tasks more feasible for practitioners. However, it is still unclear which foundation model performs best in FSL tasks in medical imaging analysis and the best way to learn from limited data.

### Research Objectives

1. **Benchmarking**: Evaluate the performance of 16 pre-trained foundation models in FSL and ZSL tasks by conducting extensive benchmarking on 19 different medical imaging datasets.
2. **Performance Comparison**: Determine which foundation models perform well under different numbers of training samples.
3. **Method Exploration**: Explore the effects of two adaptation strategies, linear probing and fine-tuning, on different models.

### Main Findings

1. **Performance of BiomedCLIP**: For very small training sets (less than 5 samples per category), BiomedCLIP (a model pre-trained specifically on medical data) performs best.
2. **Advantages of CLIP Models**: When the number of training samples increases slightly, large CLIP models (such as CLIP-ViT-H) show better performance.
3. **Practicality of ResNet-18**: In the case of more than 5 training samples per category, simple ResNet-18 fine-tuning can also achieve similar results.
4. **Relationship between Model Complexity and the Amount of Pre-trained Data**: The size of the model and the scale of the pre-trained dataset are positively correlated with FSL performance.
5. **Limitations of ZSL**: ZSL methods perform far worse than FSL methods in medical imaging tasks.

### Conclusions

1. **Optimal Strategies in the Case of Scarce Data**: In the case of very little data, using BiomedCLIP for linear probing is the best choice; as the amount of data increases, the linear probing of the CLIP-ViT-H model performs better.
2. **Model Selection and Adaptation Strategies**: Although ResNet-18 fine-tuning performs well when there is more data, in most cases, linear probing using large foundation models is still a better choice.
3. **Future Research Directions**: There is a need for further research on foundation models specifically designed for medical applications and to collect more data to train these models.

Through these studies, the paper provides practical guidance and suggestions for researchers in the medical imaging field, helping them use modern machine-learning techniques more effectively in the case of scarce data.
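The linear-probing strategy named in the summary above can be illustrated with a small sketch. This is not the paper's code. It assumes the backbone is frozen and only a linear classifier is trained on its output features; here the features are synthetic Gaussian clusters standing in for real embeddings, and `sample_k_shot`, `train_linear_probe`, and `predict` are hypothetical helper names introduced only for this example. The few-shot setting is modelled by drawing k labelled examples per class and evaluating on the rest.

```python
import math
import random


def sample_k_shot(labels, k, seed=0):
    """Return indices of k randomly chosen examples per class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    chosen = []
    for y in sorted(by_class):
        chosen.extend(rng.sample(by_class[y], k))
    return chosen


def logits(W, b, x):
    """Linear scores, one per class, for a single feature vector."""
    return [sum(wj * xj for wj, xj in zip(w, x)) + bc for w, bc in zip(W, b)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_linear_probe(features, labels, n_classes, lr=0.1, epochs=100):
    """Softmax regression on fixed features with full-batch gradient descent."""
    dim, n = len(features[0]), len(features)
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, y in zip(features, labels):
            probs = softmax(logits(W, b, x))
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                gb[c] += err
                for j in range(dim):
                    gW[c][j] += err * x[j]
        for c in range(n_classes):
            b[c] -= lr * gb[c] / n
            for j in range(dim):
                W[c][j] -= lr * gW[c][j] / n
    return W, b


def predict(W, b, x):
    scores = logits(W, b, x)
    return max(range(len(scores)), key=scores.__getitem__)


# ASSUMPTION: synthetic 4-d "embeddings" replace the output of a frozen encoder.
data_rng = random.Random(0)
features, labels = [], []
for y, center in ((0, -2.0), (1, 2.0)):
    for _ in range(50):
        features.append([data_rng.gauss(center, 0.5) for _ in range(4)])
        labels.append(y)

# 5-shot setting: train on 5 examples per class, evaluate on everything else.
shot_idx = sample_k_shot(labels, k=5, seed=1)
shot_set = set(shot_idx)
W, b = train_linear_probe(
    [features[i] for i in shot_idx], [labels[i] for i in shot_idx], n_classes=2
)
held_out = [i for i in range(len(labels)) if i not in shot_set]
accuracy = sum(predict(W, b, features[i]) == labels[i] for i in held_out) / len(held_out)
print(f"5-shot linear-probe accuracy on held-out samples: {accuracy:.2f}")
```

In a real experiment the synthetic features would be replaced by embeddings computed once from the chosen pretrained model. Fine-tuning, by contrast, would also update the backbone weights, which this sketch does not attempt.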

Navigating Data Scarcity using Foundation Models: A Benchmark of Few-Shot and Zero-Shot Learning Approaches in Medical Imaging

MedFMC: A Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification

On the Challenges and Perspectives of Foundation Models for Medical Image Analysis

Foundation AI Model for Medical Image Segmentation

Foundation Models in Radiology: What, How, When, Why and Why Not

Are Natural Domain Foundation Models Useful for Medical Image Classification?

A Clinical Benchmark of Public Self-Supervised Pathology Foundation Models

Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques

Foundational Models in Medical Imaging: A Comprehensive Survey and Future Vision

Towards Scalable Foundation Models for Digital Dermatology

A Medical Data-Effective Learning Benchmark for Highly Efficient Pre-training of Foundation Models

Foundation model for cancer imaging biomarkers

EHRSHOT: An EHR Benchmark for Few-Shot Evaluation of Foundation Models

Low-resource finetuning of foundation models beats state-of-the-art in histopathology

A comparison of few-shot and traditional named entity recognition models for medical text

Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval

Medical image foundation models in assisting diagnosis of brain tumors: a pilot study

Overcoming Data Scarcity in Biomedical Imaging with a Foundational Multi-Task Model

Leveraging Foundation Models for Content-Based Medical Image Retrieval in Radiology

Specialized Foundation Models Struggle to Beat Supervised Baselines

RET-CLIP: A Retinal Image Foundation Model Pre-trained with Clinical Diagnostic Reports