Abstract:

Background: Gleason grading remains the gold standard for histological classification and prognosis of prostate cancer, yet its subjectivity leads to variability in grading between pathologists, potentially affecting clinical decision-making. Artificial intelligence (AI), particularly self-supervised vision transformer (ViT) architectures, can enhance diagnostic accuracy and consistency for prostate cancer. We trained and validated a generalised AI-driven system for diagnosing prostate cancer using diverse datasets of tissue microarray (TMA) core images and whole slide images (WSIs) stained with hematoxylin and eosin.

Methods: We analysed eight prostate cancer datasets comprising 12,711 histological images from 3,648 patients, including both TMA core images and WSIs. Patches of 512 × 512 pixels were extracted at 10x magnification from the histological images, together with their corresponding mask annotations from the segmentation data. The Macenko method was used to normalise colours for consistency across diverse images. We then trained a multi-resolution (5x, 10x, 20x, and 40x) binary classifier to distinguish benign from malignant tissue, followed by a multi-class classifier to sub-categorise malignant tissue by Gleason pattern (GP). Finally, the models were externally validated on 11,132 histology images from 2,176 patients to determine the International Society of Urological Pathology (ISUP) grade. Models were assessed using various classification metrics, and the agreement between the models' predictions and the ground truth was quantified using the quadratic weighted Cohen's kappa (k) score.

Results: Our multi-resolution binary classifier demonstrated robust performance in distinguishing malignant from benign tissue, with a k score of 0.967 on internal validation and k scores ranging from 0.876 to 0.995 across four unseen testing datasets. The multi-class classifier also performed well in distinguishing GP3, GP4, and GP5, with an overall k score of 0.841. This model was further tested across four datasets, obtaining k scores ranging from 0.774 to 0.888. Attention maps generated by both classifiers revealed the clinical features of GPs. The models' performance was also compared against an independent pathologist's annotation on an external dataset, achieving a k score of 0.752 for four classes.

Conclusion: Our self-supervised ViT-based model demonstrates high utility in diagnosing and grading prostate cancer using histological images. The models perform robustly in categorising benign and malignant tissue and in further differentiating malignant tissue by the aggressiveness of the cancer. Attention maps showed high coherence with expert-confirmed pathological features. External validation reflects the robustness and generalisability of the models across different datasets, highlighting their clinical applicability as a digital pathology tool for diagnosing and grading prostate cancer.
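The agreement metric reported throughout the abstract, the quadratic weighted Cohen's kappa, can be illustrated with a short sketch. This is not the authors' code: the function name `quadratic_weighted_kappa` and the integer label encoding (for example 0 = benign, 1 = GP3, 2 = GP4, 3 = GP5) are assumptions made purely for illustration. The idea is that disagreements are penalised by the squared distance between the ordinal classes, so confusing GP3 with GP5 costs more than confusing GP3 with GP4.

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Quadratic weighted Cohen's kappa for ordinal integer labels 0..n_classes-1.

    Hypothetical helper for illustration only; the label encoding is an assumption.
    """
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)

    # Observed confusion matrix: rows are true labels, columns are predictions.
    observed = np.zeros((n_classes, n_classes), dtype=float)
    np.add.at(observed, (y_true, y_pred), 1)

    # Expected counts if the two raters were independent (outer product of marginals).
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

    # Quadratic penalty: 0 on the diagonal, 1 for the most distant pair of classes.
    idx = np.arange(n_classes)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2

    # Note: the denominator is zero when both raters use a single identical class.
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()


if __name__ == "__main__":
    truth = [0, 1, 2, 3, 3, 2, 1, 0]
    preds = [0, 1, 2, 3, 2, 2, 1, 0]
    print(quadratic_weighted_kappa(truth, preds, n_classes=4))
```

A score of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. In practice, scikit-learn's `cohen_kappa_score(y_true, y_pred, weights="quadratic")` computes the same quantity.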

High-Performance Classification of Breast Cancer Histopathological Images Using Fine-Tuned Vision Transformers on the BreakHis Dataset

Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification

Implementing vision transformer for classifying 2D biomedical images

DCT-HistoTransformer: Efficient Lightweight Vision Transformer with DCT Integration for histopathological image analysis

Vision transformer-convolution for breast cancer classification using mammography images: A comparative study

Vision transformer introduces a new vitality to the classification of renal pathology

Fine tuning deep learning models for breast tumor classification

Vision Transformer for Classification of Breast Ultrasound Images

RDTNet: A residual deformable attention based transformer network for breast cancer classification

ViT-CB: Integrating hybrid Vision Transformer and CatBoost to enhanced brain tumor detection with SHAP

Semi-supervised vision transformer with adaptive token sampling for breast cancer classification

Towards efficient diagnostics: refining vision transformers for medical image multi-label classification

Vision Transformers for Computational Histopathology

Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-Based Noninvasive Digital System

BUViTNet: Breast Ultrasound Detection via Vision Transformers

Vision Transformers for Small Histological Datasets Learned through Knowledge Distillation

CViTS-Net: A CNN-ViT Network With Skip Connections for Histopathology Image Classification

Pathological Insights: Enhanced Vision Transformers for the Early Detection of Colorectal Cancer

Vision Transformer for Efficient Chest X-ray and Gastrointestinal Image Classification

A generalised vision transformer-based self-supervised model for diagnosing and grading prostate cancer using histological images

Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs