Abstract: Due to the common content of anatomy, radiology images and their corresponding reports exhibit high similarity. Such inherent data bias can predispose automatic report generation models to learn entangled and spurious representations, resulting in misdiagnostic reports. To tackle this, we propose a novel CounterFactual Explanations-based framework (CoFE) for radiology report generation. Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by asking "what if" questions. By leveraging this concept, CoFE can learn non-spurious visual representations by contrasting the representations of factual and counterfactual images. Specifically, we derive counterfactual images by swapping a patch between positive and negative samples until a predicted diagnosis shift occurs. Here, positive and negative samples are the most semantically similar samples that have different diagnosis labels. Additionally, CoFE employs a learnable prompt to efficiently fine-tune the pre-trained large language model, encapsulating both factual and counterfactual content to provide a more generalizable prompt representation. Extensive experiments on two benchmarks demonstrate that leveraging counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports and to outperform competing methods on language generation and clinical efficacy metrics.
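
The patch-swapping step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `swap_until_shift`, the raster scan order, the one-directional copy from the negative sample, the patch size, and the toy threshold classifier are all assumptions made for illustration.

```python
import numpy as np


def swap_until_shift(factual, negative, predict, patch=4):
    """Copy patches from `negative` into a copy of `factual` until `predict` changes.

    Returns (counterfactual, (top, left)) for the patch that triggered the
    shift, or (None, None) if no shift occurs. Raster order is an assumption.
    """
    original_label = predict(factual)
    candidate = factual.copy()  # leave the factual image untouched
    height, width = factual.shape[:2]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Overwrite one more patch with the negative sample's pixels.
            candidate[top:top + patch, left:left + patch] = (
                negative[top:top + patch, left:left + patch]
            )
            if predict(candidate) != original_label:
                return candidate, (top, left)
    return None, None


def toy_predict(image):
    """Hypothetical stand-in for a diagnosis classifier: thresholds mean intensity."""
    return int(image.mean() > 0.4)


# Toy stand-ins (hypothetical): an all-dark factual image and an all-bright
# negative image with a different predicted label.
factual_image = np.zeros((8, 8))
negative_image = np.ones((8, 8))

counterfactual, location = swap_until_shift(factual_image, negative_image, toy_predict)
```

In this toy setting the loop stops as soon as enough bright patches have been copied to flip the thresholded prediction. A real pipeline would presumably replace `toy_predict` with a trained classifier and choose the negative sample by semantic similarity, as the abstract describes.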

What problem does this paper attempt to address?

This paper addresses the data bias that arises in automatic radiology report generation because images and their reports are highly similar to one another. This bias can cause a report generation model to learn incorrect feature representations and produce misdiagnostic reports. To overcome this, the authors propose CoFE, a framework based on CounterFactual Explanations (CEs) for radiology report generation. Specifically:

1. **Problem Background**:
   - Radiology images and their corresponding reports show high similarity because they share common anatomical content.
   - This inherent data bias may cause the report generation model to learn entangled and spurious feature representations and then generate inaccurate diagnostic reports.
2. **Solutions**:
   - **CounterFactual Explanations (CEs)**: By posing "what if" scenarios, CEs help explain how an algorithm's decision can be changed.
   - **Contrastive Learning**: CoFE learns non-spurious visual features by contrasting the representations of factual and counterfactual images.
   - **Counterfactual Image Generation**: Counterfactual images are generated by swapping an image patch between positive and negative samples until the predicted diagnosis changes. Positive and negative samples are the most semantically similar samples that have different diagnosis labels.
   - **Learnable Prompts**: CoFE uses a learnable prompt, which encapsulates both factual and counterfactual content, to efficiently fine-tune a pre-trained large language model (LLM) and provide a more generalizable prompt representation.
3. **Method Overview**:
   - **Encoder**: Pre-trained ViT and PubMedBERT encode images and text, respectively.
   - **Decoder**: The pre-trained GPT-2 Medium model generates the reports.
   - **Counterfactual Generation Module**: Counterfactual images are generated through a negative sample selection strategy and image patch swapping, and a contrastive loss is computed on them.
   - **Joint Optimization**: The model is trained by jointly optimizing the image-report contrastive loss, the report generation loss, and the counterfactual loss.
4. **Experimental Results**:
   - Experiments on two benchmark datasets, IU-Xray and MIMIC-CXR, show that CoFE outperforms competing methods on both natural language generation (NLG) and clinical efficacy metrics.
   - On the MIMIC-CXR dataset in particular, CoFE achieves the best performance on the clinical efficacy metrics Precision, Recall, and F1-score.

In conclusion, the paper mitigates the data bias problem in automatic radiology report generation by introducing counterfactual explanations and contrastive learning, improving the accuracy and clinical applicability of the generated reports.
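
The joint optimization summarized above can be sketched as follows. The summary does not give the exact form of any loss term, so everything here is an assumption: the InfoNCE-style form of the counterfactual term, the choice of the report embedding as the positive, the temperature, and the equal default weights. The function names are hypothetical.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def counterfactual_contrastive_loss(factual, report, counterfactual, temperature=0.1):
    """InfoNCE-style loss (assumed form, not taken from the paper).

    Pulls the factual image feature toward its report embedding and pushes it
    away from the counterfactual image feature.
    """
    positive = np.exp(cosine(factual, report) / temperature)
    negative = np.exp(cosine(factual, counterfactual) / temperature)
    return float(-np.log(positive / (positive + negative)))


def joint_loss(contrastive, generation, counterfactual, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three loss terms; the weights are an assumption."""
    return (weights[0] * contrastive
            + weights[1] * generation
            + weights[2] * counterfactual)


# Toy 2-D embeddings (hypothetical values).
factual_feature = np.array([1.0, 0.0])
report_feature = np.array([1.0, 0.0])
counterfactual_feature = np.array([0.0, 1.0])

# Small when the factual feature matches its report and differs from the counterfactual.
aligned = counterfactual_contrastive_loss(
    factual_feature, report_feature, counterfactual_feature)
# Large when the factual feature is closer to the counterfactual than to the report.
confused = counterfactual_contrastive_loss(
    factual_feature, counterfactual_feature, report_feature)
```

The two toy calls show the intended behaviour of such a term: the loss is near zero when the factual feature aligns with its report, and large when it aligns with the counterfactual instead.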

Contrastive Learning with Counterfactual Explanations for Radiology Report Generation

An Inclusive Task-Aware Framework for Radiology Report Generation

Rethinking Radiology Report Generation via Causal Inspired Counterfactual Augmentation

Rethinking Radiology Report Generation Via Causal Reasoning and Counterfactual Augmentation.

Interactive dual-stream contrastive learning for radiology report generation

Multi-Grained Radiology Report Generation With Sentence-Level Image-Language Contrastive Learning

Reading Radiology Imaging Like The Radiologist

Boosting Radiology Report Generation by Infusing Comparison Prior

Radiology Report Generation via Structured Knowledge-Enhanced Multi-modal Attention and Contrastive Learning.

Graph Enhanced Contrastive Learning for Radiology Findings Summarization

MKCL: Medical Knowledge with Contrastive Learning model for radiology report generation

Robust image representations with counterfactual contrastive learning

Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation

Visual-Linguistic Causal Intervention for Radiology Report Generation

Reinforced visual interaction fusion radiology report generation

"Nothing Abnormal": Disambiguating Medical Reports via Contrastive Knowledge Infusion

Explaining the Black-box Smoothly- A Counterfactual Approach

Representative Image Feature Extraction via Contrastive Learning Pretraining for Chest X-ray Report Generation

Longitudinal Data and a Semantic Similarity Reward for Chest X-Ray Report Generation

Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report Generation

A Self-Guided Framework for Radiology Report Generation