Few-Shot Radiology Report Generation via Knowledge Transfer and Multi-modal Alignment.

Xing Jia, Yun Xiong, Jiawei Zhang, Yao Zhang, Yangyong Zhu, Philip S. Yu
DOI: https://doi.org/10.1109/BIBM55620.2022.9995533
2022-01-01
Abstract: Automatic radiology report generation aims to generate informative text from a given medical image, which can assist diagnosis and lighten the workload of radiologists. Although several models have been proposed for this task, few address report generation for rare diseases; an exception is RareGen, which tackles the problem by enhancing the semantic representations of rare diseases. However, several problems remain. First, the way current studies model correlations among diseases can introduce frequency bias, which hampers the detection of rare diseases. Second, it remains unclear how to obtain better representations of disease regions so as to benefit the corresponding report generation in the decoding stage. To tackle these challenges, we propose a new few-shot radiology report generation model, FS-Gen. FS-Gen comprises two modules: one for more effective detection of rare diseases in the encoding stage, and one for better representation of disease regions in the decoding stage. Specifically, in the encoding stage, a cascade visual enhancement module strengthens the correlations among diseases without incurring frequency bias. In the decoding stage, a co-referential aligned topic generation module simultaneously captures the location and semantic information of disease regions by aligning the multimodal representations. Extensive experiments on real-world medical image datasets demonstrate the effectiveness of our model.