Abstract:
Objectives: To build self-supervised foundation models for multicontrast MRI of the whole brain and evaluate their efficacy in assisting the diagnosis of brain tumors.
Methods: In this retrospective study, foundation models were developed from 57,621 enhanced head MRI scans through self-supervised learning, using a pretext task of cross-contrast context restoration with two different content dropout schemes. Downstream classifiers were built on the pretrained foundation models and fine-tuned for brain tumor detection, discrimination, and molecular status prediction. Performance was evaluated using accuracy, sensitivity, specificity, and area under the ROC curve (AUC). Convolutional neural networks trained exclusively on downstream task data served as the comparison.
Results: The pretrained foundation models extracted effective representations from multicontrast whole-brain volumes. On independent test datasets, the best classifiers initialized with pretrained weights achieved accuracies of 94.9%, 92.3%, and 80.4%, with corresponding AUC values of 0.981, 0.972, and 0.852, in brain tumor detection, discrimination, and molecular status prediction, respectively. The classifiers with pretrained weights outperformed the convolutional classifiers trained from scratch by approximately 10% in accuracy and AUC across all tasks. The saliency regions in the correctly predicted cases were mainly clustered around the tumors. Classifiers derived from the two dropout schemes differed significantly only in the detection of brain tumors.
Conclusions: Foundation models obtained from self-supervised learning showed encouraging potential for scalability and interpretability in downstream brain tumor-related tasks and hold promise for extension to neurological diseases with diffusely distributed lesions.
Clinical relevance statement: The application of our proposed method to the prediction of key molecular status in gliomas is expected to improve treatment planning and patient outcomes. Additionally, the foundation model we developed could serve as a cornerstone for advancing AI applications in the diagnosis of brain-related diseases.
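For readers unfamiliar with the four metrics named in the abstract (accuracy, sensitivity, specificity, and AUC), the following is a minimal sketch of how they are commonly computed for a binary classifier. It is not the study's evaluation code: the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. AUC is computed here from its rank interpretation, the fraction of positive/negative pairs in which the positive case receives the higher score.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and AUC for binary labels (1 = positive)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1

    # AUC as the probability that a random positive outranks a random negative;
    # tied scores count as half a correct ranking.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": correct / (len(pos) * len(neg)),
    }


# Toy example (made-up values, not data from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
metrics = binary_metrics(labels, scores)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are typically reported together.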

Building Universal Foundation Models for Medical Image Analysis with Spatially Adaptive Networks

Integrating Spatial Prior Adapter for Enhancing SAM Performance in Medical Image Segmentation

Many Birds, One Stone: Medical Image Segmentation with Multiple Partially Labeled Datasets

MedFMC: A Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification

Universal Model for 3D Medical Image Analysis

Unified Medical Image Pre-training in Language-Guided Common Semantic Space

Foundation AI Model for Medical Image Segmentation

Spatial-Frequency Dual Progressive Attention Network For Medical Image Segmentation

Swin-UMamba†: Adapting Mamba-based Vision Foundation Models for Medical Image Segmentation

Text-guided Foundation Model Adaptation for Long-Tailed Medical Image Classification

Adapting Pretrained Vision-Language Foundational Models to Medical Imaging Domains

Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training

Generative ConvNet Foundation Model With Sparse Modeling and Low-Frequency Reconstruction for Remote Sensing Image Interpretation

Medical image foundation models in assisting diagnosis of brain tumors: a pilot study

xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart

Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography

On the Challenges and Perspectives of Foundation Models for Medical Image Analysis

MEW-UNet: Multi-axis representation learning in frequency domain for medical image segmentation

MIS-FM: 3D Medical Image Segmentation using Foundation Models Pretrained on a Large-Scale Unannotated Dataset

VISTA3D: A Unified Segmentation Foundation Model For 3D Medical Imaging

E3D-GPT: Enhanced 3D Visual Foundation for Medical Vision-Language Model