A whole-slide foundation model for digital pathology from real-world data

Hanwen Xu, Naoto Usuyama, Jaspreet Bagga, Sheng Zhang, Rajesh Rao, Tristan Naumann, Cliff Wong, Zelalem Gero, Javier González, Yu Gu, Yanbo Xu, Mu Wei, Wenhui Wang, Shuming Ma, Furu Wei, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Jaylen Rosemon, Tucker Bower, Soohee Lee, Roshanthi Weerasinghe, Bill J. Wright, Ari Robicsek, Brian Piening, Carlo Bifulco, Sheng Wang, Hoifung Poon

DOI: https://doi.org/10.1038/s41586-024-07441-w

IF: 64.8

2024-05-23

Nature

Abstract: Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles [1,2,3]. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context [4]. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet [5] method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data [6]. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision–language pretraining for pathology [7,8] by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.
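To make the scale in the abstract concrete, the sketch below works through the tile arithmetic. The 256 × 256 tile size and the corpus totals (1.3 billion tiles, 171,189 slides) come from the abstract; the slide dimensions and the helper name `tiles_per_slide` are hypothetical, chosen only for illustration.

```python
import math

TILE = 256  # tile edge in pixels, as stated in the abstract


def tiles_per_slide(width_px, height_px, tile=TILE):
    """Number of non-overlapping tile-sized grid cells needed to cover a slide."""
    return math.ceil(width_px / tile) * math.ceil(height_px / tile)


# Hypothetical slide dimensions (assumption, not from the paper).
grid_tiles = tiles_per_slide(50_000, 40_000)

# Average tiles per slide implied by the corpus totals in the abstract.
avg_tiles = 1_300_000_000 / 171_189

print(grid_tiles)        # 196 * 157 = 30772 grid cells
print(round(avg_tiles))  # roughly 7594 tiles per slide on average
```

Even the corpus average of several thousand tiles per slide is far longer than the input sequences a standard vision transformer handles, which is the motivation the abstract gives for a long-context architecture.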

multidisciplinary sciences

What problem does this paper attempt to address?

The paper addresses three challenges in developing pathology foundation models for clinical use:

1. **Utilization of Large-Scale Real-World Data**: Existing pathology foundation models typically rely on comparatively small datasets (such as TCGA), which may not reflect the complexity and heterogeneity of actual clinical data. The paper proposes a new foundation model, Prov-GigaPath, pretrained on a large-scale real-world pathology dataset from the Providence health network.
2. **Global Pattern Capture**: Existing models often treat individual image tiles as independent samples, making it difficult to capture global patterns at the whole-slide level. The paper introduces the GigaPath architecture, which adapts the newly developed LongNet method to capture both local and global patterns.
3. **Open-Weight Model**: Many models pretrained on large-scale real-world patient data are not publicly available, limiting their use in clinical research and applications. The paper releases Prov-GigaPath as an open-weight model, including source code and pretrained weights.

By addressing these challenges, Prov-GigaPath achieves state-of-the-art performance on 25 of 26 benchmark tasks and shows potential for vision–language pretraining that incorporates pathology reports.
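The global-pattern point depends on making attention affordable over very long tile sequences. The following is a minimal NumPy sketch of the dilated-attention idea that LongNet is built on: split the sequence into segments, keep every r-th position within each segment, and attend only among the kept positions. It is a simplified, single-head illustration with one segment length and one dilation rate, not the GigaPath or LongNet implementation; the function names and toy sizes are assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dilated_attention(q, k, v, segment_len, dilation):
    """Single-head attention restricted to every `dilation`-th position
    inside each segment of length `segment_len`.
    Positions skipped by the dilation pattern receive a zero output here."""
    n, d = q.shape
    out = np.zeros_like(v)
    for start in range(0, n, segment_len):
        idx = np.arange(start, min(start + segment_len, n), dilation)
        scores = q[idx] @ k[idx].T / np.sqrt(d)
        out[idx] = softmax(scores) @ v[idx]
    return out


def attention_pairs(n, segment_len, dilation):
    """Count the query-key pairs evaluated by the pattern above."""
    total = 0
    for start in range(0, n, segment_len):
        kept = len(range(start, min(start + segment_len, n), dilation))
        total += kept * kept
    return total


# Toy tile-embedding sequence (sizes are illustrative only).
rng = np.random.default_rng(0)
n, d = 16, 4
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

sparse = dilated_attention(q, k, v, segment_len=8, dilation=2)
full = dilated_attention(q, k, v, segment_len=n, dilation=1)  # ordinary attention

print(attention_pairs(n, 8, 2), "pairs vs", attention_pairs(n, n, 1))
```

With one segment covering the whole sequence and a dilation of 1, the function reduces to ordinary full attention. As described in the LongNet paper, several segment-length and dilation settings are combined so that every position is covered; this sketch omits that mixing step.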

Multimodal Whole Slide Foundation Model for Pathology

Virchow: A Million-Slide Digital Pathology Foundation Model

Computational Pathology at Health System Scale -- Self-Supervised Foundation Models from Three Billion Images

Towards a general-purpose foundation model for computational pathology

A pathology foundation model for cancer diagnosis and prognosis prediction

Beyond Multiple Instance Learning: Full Resolution All-In-Memory End-To-End Pathology Slide Modeling

Generating clinical-grade pathology reports from gigapixel whole slide images with HistoGPT

PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology

PathoDuet: Foundation Models for Pathological Slide Analysis of H&E and IHC Stains

Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation

PATHS: A Hierarchical Transformer for Efficient Whole Slide Image Analysis

Large scale digital prostate pathology image analysis combining feature extraction and deep neural network

A Multimodal Knowledge-enhanced Whole-slide Pathology Foundation Model

PLUTO: Pathology-Universal Transformer

PathAsst: Redefining Pathology through Generative Foundation AI Assistant for Pathology

Pan-Cancer Diagnostic Consensus Through Searching Archival Histopathology Images Using Artificial Intelligence

Task-driven Framework Using Large Models for Digital Pathology

Clinical-grade computational pathology using weakly supervised deep learning on whole slide images

A General-Purpose Self-Supervised Model for Computational Pathology

Slideflow: Deep Learning for Digital Histopathology with Real-Time Whole-Slide Visualization