Abstract: Background and objective: Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive diagnosis of cervical lesions. However, obtaining high-quality labeled cervical OCT images is challenging and time-consuming because each image must correspond precisely to a pathological result. The scarcity of such labeled data hinders the application of supervised deep-learning models in practical clinical settings. This study addresses this challenge by proposing CMSwin, a novel self-supervised learning (SSL) framework that combines masked image modeling (MIM) with contrastive learning on the Swin-Transformer architecture to exploit abundant unlabeled cervical OCT images.
Methods: In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor. This combination resolves the inconsistency between pre-training and fine-tuning and separates the encoder's feature extraction task from the decoder's reconstruction task, allowing the encoder to extract better image representations. In addition, contrastive losses at the patch and image levels are designed to leverage massive unlabeled data.
Results: We validated the superiority of CMSwin over state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset containing 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human-machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average level of four skilled medical experts, especially in identifying high-risk cervical lesions.
Conclusion: CMSwin has great potential to help gynecologists interpret cervical OCT images in clinical settings. In addition, its integrated GradCAM module visualizes cervical lesions, giving gynecologists interpretable evidence for diagnosing cervical diseases efficiently.
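
The abstract mentions an image-level contrastive loss but does not give its exact form. As a rough illustration only, the sketch below uses a symmetric InfoNCE loss, a common choice in contrastive SSL; whether CMSwin uses this formulation is an assumption. The function name `info_nce_loss`, the `temperature` value, and the embedding shapes are hypothetical and do not come from the paper.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE loss between two batches of embeddings.

    Assumption (not from the paper): z_a[i] and z_b[i] are embeddings of
    two views of the same image (a positive pair); every other pairing
    in the batch is treated as a negative.
    """
    # L2-normalize so the dot product equals cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarity matrix of shape (N, N), scaled by temperature.
    logits = z_a @ z_b.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_prob = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        # The positive for row i sits in column i.
        return -np.mean(np.diag(log_prob))

    # Average the a-to-b and b-to-a directions.
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(logits.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(8, 16))
    view_b = view_a + 0.01 * rng.normal(size=view_a.shape)
    print("loss for matched views:", info_nce_loss(view_a, view_b))
```

Under the same assumptions, a patch-level variant would apply the function to per-patch embeddings instead of per-image embeddings.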
