Comparison of CNN-based and Transformer-based Approaches for Sparse-view CT Reconstruction

Changrong Shi, Yongshun Xiao
DOI: https://doi.org/10.1109/nss/mic44845.2022.10399215
2022-01-01
Abstract: Sparse-view computed tomography (CT) is an effective way to lower the X-ray ionizing radiation dose by reducing the scanning time. However, sparse-view CT leads to severe artifacts in the reconstructed images because the projection data are insufficient. To improve the quality of sparse-view CT images, a wide variety of deep-learning approaches have been proposed, including the use of convolutional neural networks (CNNs) as pre-processing or post-processing steps. However, convolutions have difficulty capturing long-range dependencies. Recently, the Transformer, which is designed to capture global information, has shown promising performance in computer vision. In this study, we compared CNN-based methods, namely the pre-processing method SIUNet and the post-processing method FBPConvNet, with the Transformer-based SwinIR for sparse-view CT. To further assess the effectiveness of SwinIR, we also proposed a dual-domain method that combines SIUNet with FBPConvNet as an additional comparison. These approaches were trained and tested on simulated sparse-view data generated from the "2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge" datasets with different numbers of projection views and noise levels. Experimental results showed that, compared with the three CNN-based methods, SwinIR suppressed more noise artifacts and recovered details better across various scenarios. Furthermore, SwinIR improved the average PSNR on the testing datasets by 1 to 2 dB compared with FBPConvNet while using only about one-third of the parameters.
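For readers unfamiliar with the metric, the reported PSNR gain can be made concrete with a minimal sketch of the standard definition, PSNR = 10 log10(R^2 / MSE), where R is the data range. The function name `psnr`, the toy 2x2 images, and the data range of 1.0 are illustrative assumptions; the abstract does not state how the images were normalized or how PSNR was computed in the study.

```python
import math


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images.

    `data_range` is the assumed span of valid pixel values (hypothetical
    here; the abstract does not specify the normalization used).
    """
    ref = [v for row in reference for v in row]
    est = [v for row in estimate for v in row]
    if len(ref) != len(est):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy 2x2 example (made-up values, not data from the paper).
reference = [[0.0, 0.5], [1.0, 0.25]]
noisy = [[0.1, 0.4], [0.9, 0.35]]        # every pixel off by 0.1
less_noisy = [[0.05, 0.45], [0.95, 0.30]]  # every pixel off by 0.05

gain_db = psnr(reference, less_noisy, 1.0) - psnr(reference, noisy, 1.0)
print(f"PSNR gain: {gain_db:.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a 1 dB gain corresponds to roughly 21% lower MSE and a 2 dB gain to roughly 37% lower MSE, at a fixed data range.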