2D medical image synthesis using transformer-based denoising diffusion probabilistic model
Shaoyan Pan, Tonghe Wang, Richard L J Qiu, Marian Axente, Chih-Wei Chang, Junbo Peng, Ashish B Patel, Joseph Shelton, Sagar A Patel, Justin Roper, Xiaofeng Yang
DOI: https://doi.org/10.1088/1361-6560/acca5c
IF: 3.5
2023-04-06
Physics in Medicine and Biology
Abstract: Objective: Artificial intelligence (AI) methods have gained popularity in medical imaging research. However, the training image datasets needed for successful AI model deployment are not always available at the desired size and scope. In this paper, we introduce a medical image synthesis framework aimed at addressing the challenge of limited training datasets for AI models.
Approach: The proposed 2D image synthesis framework is based on a diffusion model using a Swin-transformer-based network. The model consists of a forward Gaussian noise process and a reverse denoising process performed by the transformer-based diffusion model. The training data comprise four image datasets: chest X-rays, heart MRI, pelvic CT, and abdomen CT. We evaluated the authenticity, quality, and diversity of the synthetic images using visual Turing assessments conducted by three medical physicists and four quantitative metrics comparing the synthetic and true images: the Inception score (IS), the Fréchet Inception Distance (FID), feature similarity (reported as FDS), and the diversity score (DS, indicating diversity similarity). To assess the framework's value for training AI models, we conducted COVID-19 classification tasks using real images, synthetic images, and mixtures of both.
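The forward Gaussian noise process mentioned above can be illustrated with a minimal sketch. It assumes the standard denoising diffusion closed form, in which a noisy image at step t is a weighted sum of the clean image and Gaussian noise. The linear schedule, its endpoints, the 1000 steps, and all function names are illustrative assumptions, not settings taken from the paper, and the reverse (Swin-transformer) denoiser is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_noise(x0, t, alpha_bar, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    x0 is a flat list of pixel values scaled to [-1, 1].
    Returns the noisy pixels and the noise that was added.
    """
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * p + sigma * e for p, e in zip(x0, eps)]
    return x_t, eps


rng = random.Random(0)
alpha_bar = cumulative_alphas(linear_beta_schedule(1000))
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16)]  # toy 16-pixel "image"
x_early, _ = forward_noise(x0, 0, alpha_bar, rng)  # almost unchanged
x_late, eps_late = forward_noise(x0, 999, alpha_bar, rng)  # almost pure noise
```

In this sketch the first step barely perturbs the image, while the last step is nearly indistinguishable from the sampled noise, which is the starting point a reverse denoising network would work back from.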
Main results: Visual Turing assessments showed an average accuracy of 0.64 (an accuracy closer to 0.50 indicates more realistic synthetic images), sensitivity of 0.79, and specificity of 0.50. Average quantitative scores across all datasets were IS=2.28, FID=37.27, FDS=0.20, and DS=0.86. For the COVID-19 classification task, the baseline network obtained an accuracy of 0.88 using a purely real dataset, 0.89 using a purely synthetic dataset, and 0.93 using a mixed dataset of real and synthetic data.
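As a reading aid for the visual Turing figures, the following sketch shows how accuracy, sensitivity, and specificity could be computed from a rater's real-versus-synthetic judgments. It assumes that "real" is treated as the positive class, which the abstract does not state. The function name and the toy labels are hypothetical and are not study data.

```python
def turing_metrics(is_real, judged_real):
    """Accuracy, sensitivity, and specificity of a real-vs-synthetic rater.

    Assumption: "real" is the positive class. Both arguments are
    equal-length lists of booleans, one entry per image shown.
    """
    pairs = list(zip(is_real, judged_real))
    tp = sum(1 for r, j in pairs if r and j)          # real, judged real
    tn = sum(1 for r, j in pairs if not r and not j)  # synthetic, judged synthetic
    fp = sum(1 for r, j in pairs if not r and j)      # synthetic, judged real
    fn = sum(1 for r, j in pairs if r and not j)      # real, judged synthetic
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy example (not study data): 4 real and 4 synthetic images.
is_real = [True, True, True, True, False, False, False, False]
judged_real = [True, True, True, False, True, True, False, False]
metrics = turing_metrics(is_real, judged_real)
```

Under this convention, a specificity near 0.50 means the rater labels synthetic images as synthetic only about half the time, which is what chance-level discrimination would look like.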
Significance: An image synthesis framework was demonstrated that can generate high-quality medical images across different imaging modalities, with the purpose of supplementing existing training sets for AI model deployment. This method has potential applications in many areas of data-driven medical imaging research.
engineering, biomedical; radiology, nuclear medicine & medical imaging