Contrastive Learning with Counterfactual Explanations for Radiology Report Generation

Mingjie Li, Haokun Lin, Liang Qiu, Xiaodan Liang, Ling Chen, Abdulmotaleb Elsaddik, Xiaojun Chang
2024-07-20
Abstract: Because they share common anatomical content, radiology images and their corresponding reports exhibit high similarity. This inherent data bias can predispose automatic report generation models to learn entangled and spurious representations, resulting in misdiagnostic reports. To tackle this, we propose a novel CounterFactual Explanations-based framework (CoFE) for radiology report generation. Counterfactual explanations serve as a potent tool for understanding how decisions made by algorithms can be changed by posing "what if" scenarios. By leveraging this concept, CoFE can learn non-spurious visual representations by contrasting the representations of factual and counterfactual images. Specifically, we derive counterfactual images by swapping a patch between positive and negative samples until a predicted diagnosis shift occurs. Here, positive and negative samples are the most semantically similar samples that have different diagnosis labels. Additionally, CoFE employs a learnable prompt to efficiently fine-tune the pre-trained large language model, encapsulating both factual and counterfactual content to provide a more generalizable prompt representation. Extensive experiments on two benchmarks demonstrate that leveraging counterfactual explanations enables CoFE to generate semantically coherent and factually complete reports and to outperform competing methods on language generation and clinical efficacy metrics.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses data bias in automatic radiology report generation, which arises from the high similarity among images and among reports. This bias can lead a report generation model to learn incorrect feature representations and produce misdiagnostic reports. To overcome it, the authors propose CoFE, a new framework for radiology report generation based on CounterFactual Explanations (CEs). Specifically:

1. **Problem Background**:
   - Radiology images and their corresponding reports are highly similar because they share common anatomical structures.
   - This inherent data bias can cause the report generation model to learn entangled and spurious feature representations and then generate inaccurate diagnostic reports.
2. **Solutions**:
   - **CounterFactual Explanations (CEs)**: By constructing "what if" scenarios, CEs show how an algorithmic decision can be changed.
   - **Contrastive Learning**: CoFE learns non-spurious visual features by contrasting the representations of factual and counterfactual images.
   - **Counterfactual Image Generation**: Counterfactual images are generated by swapping an image patch between positive and negative samples until the predicted diagnosis changes. Positive and negative samples are the most semantically similar samples that have different diagnosis labels.
   - **Learnable Prompts**: CoFE uses a learnable prompt to efficiently fine-tune a pre-trained large language model (LLM). The prompt encapsulates both factual and counterfactual content to provide a more generalizable prompt representation.
3. **Method Overview**:
   - **Encoder**: Pre-trained ViT and PubMedBERT encode images and text, respectively.
   - **Decoder**: The pre-trained GPT-2 Medium model generates reports.
   - **Counterfactual Generation Module**: Counterfactual images are generated through a negative sample selection strategy and image patch swapping, and a contrastive loss is computed on them.
   - **Joint Optimization**: The model is trained by jointly optimizing the image-report contrastive loss, the report generation loss, and the counterfactual loss.
4. **Experimental Results**:
   - Experiments on two benchmark datasets, IU-Xray and MIMIC-CXR, show that CoFE outperforms competing methods on both natural language generation (NLG) and clinical efficacy metrics.
   - On MIMIC-CXR in particular, CoFE achieves the best clinical efficacy results in terms of Precision, Recall, and F1-score.

In conclusion, the paper tackles the data bias problem in automatic radiology report generation by combining counterfactual explanations with contrastive learning, and it reports improved accuracy and clinical applicability of the generated reports.
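
The patch-swapping step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: images are toy nested lists of intensities, `toy_predict` is a hypothetical stand-in for the diagnosis classifier, and patches are visited in simple raster order because the summary does not say how patches are chosen.

```python
from copy import deepcopy
from typing import Callable, List, Optional, Tuple

Image = List[List[float]]  # toy stand-in for an image: rows of pixel intensities


def swap_patch(target: Image, source: Image, row: int, col: int, patch: int) -> Image:
    """Return a copy of `target` whose patch at (row, col) is taken from `source`."""
    out = deepcopy(target)
    for r in range(row, min(row + patch, len(target))):
        out[r][col:col + patch] = source[r][col:col + patch]
    return out


def make_counterfactual(
    factual: Image,
    negative: Image,
    predict: Callable[[Image], int],
    patch: int = 4,
) -> Tuple[Optional[Image], int]:
    """Swap patches from `negative` into `factual` until the prediction shifts.

    Returns (counterfactual, swaps), or (None, swaps) if no shift occurred.
    Raster-order traversal is an assumption made for this sketch.
    """
    original_label = predict(factual)
    current = factual
    swaps = 0
    for row in range(0, len(factual), patch):
        for col in range(0, len(factual[0]), patch):
            current = swap_patch(current, negative, row, col, patch)
            swaps += 1
            if predict(current) != original_label:
                return current, swaps  # predicted diagnosis has shifted
    return None, swaps


def toy_predict(image: Image) -> int:
    """Hypothetical classifier: label 1 if the mean intensity exceeds 0.5."""
    pixels = [p for row in image for p in row]
    return int(sum(pixels) / len(pixels) > 0.5)


# Toy data: an all-dark "factual" image and an all-bright "negative" sample.
factual = [[0.0] * 8 for _ in range(8)]
negative = [[1.0] * 8 for _ in range(8)]

counterfactual, swaps = make_counterfactual(factual, negative, toy_predict)
print(swaps, toy_predict(factual), toy_predict(counterfactual))
```

With this toy data the classifier flips after the third swapped 4x4 patch, when the mean intensity reaches 0.75. In a real pipeline the factual and counterfactual images would then be encoded and contrasted, which this sketch leaves out.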