TranSQ: Transformer-Based Semantic Query for Medical Report Generation

Ming Kong, Zhengxing Huang, Kun Kuang, Qiang Zhu, Fei Wu
DOI: https://doi.org/10.1007/978-3-031-16452-1_58
2022-01-01
Abstract: Medical report generation, which aims at automatically generating coherent multi-sentence reports for given medical images, has received growing research interest due to its tremendous potential in facilitating clinical workflow and improving health services. Because medical reports are highly patterned, each sentence can be viewed as the description of an image observation with a specific purpose. To this end, this study proposes a novel Transformer-based Semantic Query (TranSQ) model that treats medical report generation as a direct set prediction problem. Specifically, our model generates a set of semantic features to match plausible clinical concerns and composes the report through sentence retrieval and selection. Experimental results on two prevailing radiology report datasets, i.e., IU X-Ray and MIMIC-CXR, demonstrate that our model outperforms state-of-the-art models on the generation task in terms of both language generation effectiveness and clinical efficacy. These results highlight the utility of our approach in generating medical reports with topics of clinical concern as well as sentence-level visual-semantic attention mappings. The source code is available at https://github.com/zjukongming/TranSQ.
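The retrieval-and-selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine-similarity retrieval, the 0.5 selection threshold, and the toy two-dimensional embeddings are all assumptions made for illustration, and the sketch covers only report composition at inference time, not model training.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def compose_report(query_features, select_probs, candidates, threshold=0.5):
    """Compose a report from semantic query outputs (hypothetical helper).

    query_features: one feature vector per semantic query.
    select_probs:   one selection probability per semantic query.
    candidates:     list of (sentence, embedding) pairs to retrieve from.
    threshold:      assumed cut-off below which a query is dropped.
    """
    report = []
    for feature, prob in zip(query_features, select_probs):
        if prob < threshold:
            continue  # selection: this query is judged not relevant to the image
        # retrieval: pick the candidate sentence closest to the semantic feature
        sentence, _ = max(candidates, key=lambda c: cosine(feature, c[1]))
        if sentence not in report:  # avoid repeating the same sentence
            report.append(sentence)
    return " ".join(report)


# Toy data, invented for illustration only.
candidates = [
    ("The heart size is normal.", [1.0, 0.0]),
    ("No pleural effusion is seen.", [0.0, 1.0]),
    ("The lungs are clear.", [0.7, 0.7]),
]
queries = [[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]]
probs = [0.9, 0.8, 0.2]

print(compose_report(queries, probs, candidates))
```

In this toy run the third query falls below the threshold and contributes no sentence, which mirrors the idea that only a subset of the predicted set ends up in the final report.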