Report-Concept Textual-Prompt Learning for Enhancing X-ray Diagnosis

Xiongjun Zhao, Zhengyu Liu, Fen Liu, Guanting Li, Yutao Dou, Shaoliang Peng
DOI: https://doi.org/10.1145/3664647.3681568
2024-01-01
Abstract: Despite significant advances in image-text medical visual language modeling, the high cost of fine-grained image annotation for alignment with radiology reports has led current approaches to focus primarily on semantic alignment between the image and the full report, neglecting the critical diagnostic information contained in the text. This is insufficient in medical scenarios demanding high explainability. To address this problem, we introduce radiology reports as images in prompt learning. Specifically, we extract key clinical concepts, lesion locations, and positive labels from easily accessible radiology reports and combine them with an external medical knowledge base to form fine-grained self-supervised signals. Moreover, we propose a novel Report-Concept Textual-Prompt Learning (RC-TPL) method, which aligns radiology reports at multiple levels. In the inference phase, the report-level and concept-level prompts provide rich global and local semantic understanding for X-ray images. Extensive experiments on X-ray image datasets demonstrate the superior performance of our approach over various baselines, especially when imaging data is scarce. Our study not only significantly improves the accuracy of data-constrained medical X-ray diagnosis, but also demonstrates how integrating domain-specific conceptual knowledge can enhance the explainability of medical image analysis.