Learning to Summarize Chinese Radiology Findings With a Pre-Trained Encoder

Zuowei Jiang, Xiaoyan Cai, Libin Yang, Dehong Gao, Wei Zhao, Junwei Han, Jun Liu, Dinggang Shen, Tianming Liu
DOI: https://doi.org/10.1109/TBME.2023.3280987
Abstract: Automatic radiology report summarization has attracted research interest in recent years as a step towards computer-aided diagnosis that can alleviate physicians' workload. However, existing deep learning methods for English radiology report summarization cannot be directly applied to Chinese radiology reports because of limitations in the related corpora. In response, we propose an abstractive summarization approach for Chinese chest radiology reports. We construct a pre-training corpus from a Chinese medical-related pre-training dataset, and we collect Chinese chest radiology reports from the Department of Radiology at the Second Xiangya Hospital as the fine-tuning corpus. To improve the initialization of the encoder, we introduce a new task-oriented pre-training objective, called the Pseudo Summary Objective, on the pre-training corpus. We then develop a Chinese pre-trained language model called Chinese medical BERT (CMBERT), which is used to initialize the encoder and is fine-tuned on the abstractive summarization task. When tested on a real large-scale hospital dataset, our approach improves markedly over other abstractive summarization models. This highlights its effectiveness in addressing the limitations of previous methods for Chinese radiology report summarization. Overall, the proposed approach is a promising direction for the automatic summarization of Chinese chest radiology reports and offers a viable way to alleviate physicians' workload in computer-aided diagnosis.