Abstract: Scarcity of labels for medical images is a significant barrier to training representation learning approaches based on deep neural networks. This limitation also applies to imaging data collected during routine clinical care and stored in picture archiving and communication systems (PACS), as these data rarely come with the high-quality labels required for medical image computing tasks. However, medical images extracted from PACS are commonly coupled with descriptive radiology reports that contain significant information and could be leveraged to pre-train imaging models, which could then serve as starting points for task-specific fine-tuning. In this work, we perform a head-to-head comparison of three different self-supervised strategies to pre-train the same imaging model on 3D brain computed tomography angiogram (CTA) images, with large vessel occlusion (LVO) detection as the downstream task. These strategies evaluate two natural language processing (NLP) approaches, one to extract 100 explicit radiology concepts (Rad-SpatialNet) and the other to create general-purpose radiology report embeddings (DistilBERT). In addition, we experiment with learning radiology concepts directly or by using a recent self-supervised learning approach (CLIP) that learns by ranking the distance between language and image vector embeddings. The LVO detection task was selected because it requires 3D imaging data, is clinically important, and requires the algorithm to learn outputs not explicitly stated in the radiology report. Pre-training was performed on an unlabeled dataset containing 1,542 pairs of 3D CTA images and reports. The downstream task was evaluated on a dataset of 402 subjects labeled for LVO. We find that pre-training with CLIP-based strategies improves the imaging model's LVO detection performance compared to a model trained only on the labeled data. The best performance was achieved by pre-training with the explicit radiology concepts and the CLIP strategy.
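
To make the CLIP-style objective mentioned in the abstract more concrete, the following is a minimal sketch of a symmetric contrastive loss over paired image and report embeddings. It assumes the standard CLIP formulation (cosine similarities scaled by a temperature, with cross-entropy applied in both directions); the abstract does not state the exact loss used in this work, so the function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all illustrative assumptions rather than the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss for a batch of matched image-report pairs.

    image_emb[k] and text_emb[k] are assumed to come from the same study.
    The temperature is an illustrative default, not a value from the paper.
    """
    imgs = [l2_normalize(v) for v in image_emb]
    txts = [l2_normalize(v) for v in text_emb]
    n = len(imgs)

    # logits[i][j]: scaled cosine similarity between image i and report j.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]

    # Image-to-report direction: the matching report is on the diagonal.
    loss_i2t = -sum(log_softmax(logits[k])[k] for k in range(n)) / n

    # Report-to-image direction: same computation on the transposed matrix.
    columns = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = -sum(log_softmax(columns[k])[k] for k in range(n)) / n

    return 0.5 * (loss_i2t + loss_t2i)


# Toy check: correctly paired embeddings score far better than swapped ones.
matched = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_style_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch the loss is low when each image embedding is closest to its own report embedding and high when the pairs are swapped, which is the ranking behavior the abstract describes. In practice the embeddings would come from a 3D image encoder and a text encoder trained jointly, which is outside the scope of this example.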

SELF-SUPERVISED LEARNING WITH RADIOLOGY REPORTS, A COMPARATIVE ANALYSIS OF STRATEGIES FOR LARGE VESSEL OCCLUSION AND BRAIN CTA IMAGES
