A visual-language foundation model for computational pathology
Ming Y. Lu, Bowen Chen, Drew F. K. Williamson, Richard J. Chen, Ivy Liang, Tong Ding, Guillaume Jaume, Igor Odintsov, Long Phi Le, Georg Gerber, Anil V. Parwani, Andrew Zhang, Faisal Mahmood
DOI: https://doi.org/10.1038/s41591-024-02856-4
IF: 82.9
2024-03-20
Nature Medicine
Abstract: The accelerated adoption of digital pathology and advances in deep learning have enabled the development of robust models for various pathology tasks across a diverse array of diseases and patient cohorts. However, model training is often difficult due to label scarcity in the medical domain, and a model's usage is limited by the specific task and disease for which it is trained. Additionally, most models in histopathology leverage only image data, a stark contrast to how humans teach each other and reason about histopathologic entities. We introduce CONtrastive learning from Captions for Histopathology (CONCH), a visual-language foundation model developed using diverse sources of histopathology images, biomedical text and, notably, over 1.17 million image–caption pairs through task-agnostic pretraining. Evaluated on a suite of 14 diverse benchmarks, CONCH can be transferred to a wide range of downstream tasks involving histopathology images and/or text, achieving state-of-the-art performance on histology image classification, segmentation, captioning, and text-to-image and image-to-text retrieval. CONCH represents a substantial leap over concurrent visual-language pretrained systems for histopathology, with the potential to directly facilitate a wide array of machine learning-based workflows requiring minimal or no further supervised fine-tuning.
biochemistry & molecular biology; cell biology; medicine, research & experimental
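
The abstract describes contrastive pretraining on image–caption pairs and transfer to classification and retrieval with little or no fine-tuning. As a rough illustration of that general idea, the sketch below shows a generic CLIP-style symmetric contrastive loss and a similarity-based zero-shot classifier in plain Python. It is not the CONCH implementation or its API, and the abstract does not specify the exact objective. All function names, the toy 2-dimensional embeddings, and the temperature value of 0.07 are assumptions made for illustration; in practice the embeddings would come from trained image and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine_similarity_matrix(image_embs, text_embs):
    """Entry [i][j] is the cosine similarity of image i and text j."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row k is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[k]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Average of image-to-text and text-to-image losses for paired inputs.

    Assumes image_embs[k] and text_embs[k] form a matching pair.
    The temperature is an illustrative value, not one taken from the paper.
    """
    sims = cosine_similarity_matrix(image_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sims]
    logits_t = [list(col) for col in zip(*logits)]  # transpose
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


def zero_shot_classify(image_embs, class_text_embs):
    """Assign each image the index of the most similar class-prompt embedding."""
    sims = cosine_similarity_matrix(image_embs, class_text_embs)
    return [max(range(len(row)), key=row.__getitem__) for row in sims]


# Toy stand-ins for encoder outputs (hypothetical values).
toy_images = [[1.0, 0.0], [0.0, 1.0]]
toy_texts = [[2.0, 0.1], [0.1, 3.0]]

predictions = zero_shot_classify(toy_images, toy_texts)
aligned_loss = symmetric_contrastive_loss(toy_images, toy_texts)
shuffled_loss = symmetric_contrastive_loss(toy_images, toy_texts[::-1])
```

With these toy inputs, correctly paired embeddings give a lower loss than mismatched ones. Zero-shot classification needs no labeled training data: class names are written as text prompts, embedded, and compared with the image embedding. Image-to-text and text-to-image retrieval follow the same pattern, ranking by rows or columns of the similarity matrix.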