STEDNet: Swin transformer-based encoder-decoder network for noise reduction in low-dose CT
Linlin Zhu, Yu Han, Xiaoqi Xi, Huijuan Fu, Siyu Tan, Mengnan Liu, Shuangzhan Yang, Chang Liu, Lei Li, Bin Yan
DOI: https://doi.org/10.1002/mp.16249
Abstract: Background: Low-dose computed tomography (LDCT) reduces the X-ray radiation dose, making it increasingly important for routine clinical diagnosis and treatment planning. However, the noise introduced by low-dose X-ray exposure degrades CT image quality and affects the accuracy of clinical diagnosis. Purpose: Noise, artifacts, and high-frequency components are similarly distributed in LDCT images. A Transformer can capture global context through attention, modeling long-range dependencies and extracting more powerful features. In this paper, we reduce the impact of image errors on the retention of detailed information and improve noise suppression performance by fully mining the distribution characteristics of image information. Methods: This paper proposes an LDCT noise and artifact suppression network based on the Swin Transformer. The network includes a noise extraction sub-network and a noise removal sub-network. Noise extraction and removal capability is improved using a fully convolutional network for coarse extraction of high-frequency features. The noise removal sub-network improves the extraction of relevant image features by using a Swin Transformer with shifted windows as the encoder-decoder, with skip connections for global feature fusion. In addition, the receptive field is extended by extracting multi-scale image features to recover the spatial resolution of the feature maps. The network is trained with a loss that combines L1 and MS-SSIM to improve stability and denoising performance. Results: The denoising ability and clinical applicability of the method were tested on clinical datasets. Compared with DnCNN, RED-CNN, CBDNet and TSCN, STEDNet shows a better denoising effect in terms of RMSE and PSNR.
The STEDNet method effectively removes image noise and preserves the image structure to the maximum extent, making the reconstructed image closest to the normal-dose CT (NDCT) image. Subjective and objective analysis of several sets of experiments shows that the method effectively maintains the structure, edges, and textures of the denoised images while providing good noise suppression. In the real data evaluation, the method reduces RMSE by 18.82%, 15.15%, 2.25%, and 1.10% on average compared with DnCNN, RED-CNN, CBDNet, and TSCNN, respectively, and improves PSNR by 9.53%, 7.33%, 2.65%, and 3.69% on average, respectively. Conclusions: This paper proposes an LDCT image denoising algorithm based on end-to-end training. The method can effectively improve the diagnostic performance of CT images by constraining image details and restoring the LDCT image structure. The problem of increased noise and artifacts in CT images can be addressed while maintaining the integrity of CT image tissue structure and pathological information. Compared with other algorithms, the method achieves better denoising both quantitatively and qualitatively.
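The Methods mention a loss that combines L1 and MS-SSIM but do not give its exact form. The following is only a minimal NumPy sketch of one common way to build such a loss, not the authors' implementation. The mixing weight `alpha`, the per-scale exponents in `MS_WEIGHTS`, the Gaussian window settings, and all function names are assumptions introduced for illustration. NumPy is not differentiable, so an actual training setup would need the same arithmetic in an autodiff framework.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Per-scale exponents commonly used for MS-SSIM (assumed; the abstract does not state them).
MS_WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)


def _gauss_kernel(size, sigma):
    # Normalized 2D Gaussian window.
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()


def _filter(img, kernel):
    # Gaussian-weighted local mean over every fully contained window ("valid" mode).
    windows = sliding_window_view(img, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def _ssim_terms(x, y, kernel, data_range):
    # Returns (mean SSIM, mean contrast-structure term) at a single scale.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = _filter(x, kernel), _filter(y, kernel)
    var_x = _filter(x * x, kernel) - mu_x ** 2
    var_y = _filter(y * y, kernel) - mu_y ** 2
    cov = _filter(x * y, kernel) - mu_x * mu_y
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    cs = (2 * cov + c2) / (var_x + var_y + c2)
    return (lum * cs).mean(), cs.mean()


def _downsample(img):
    # 2x2 average pooling, cropping an odd trailing row or column.
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def ms_ssim(x, y, data_range=1.0, weights=MS_WEIGHTS, win=11, sigma=1.5):
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    kernel = _gauss_kernel(win, sigma)
    score = 1.0
    for i, w in enumerate(weights):
        ssim, cs = _ssim_terms(x, y, kernel, data_range)
        last = i == len(weights) - 1
        # Contrast-structure at every scale; full SSIM (with luminance) only at the coarsest.
        term = ssim if last else cs
        score *= max(float(term), 0.0) ** w  # clamp so fractional powers stay real
        if not last:
            x, y = _downsample(x), _downsample(y)
    return score


def mixed_loss(pred, target, alpha=0.84, **ms_kwargs):
    # alpha is a hypothetical mixing weight, not a value taken from the paper.
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    l1 = np.mean(np.abs(pred - target))
    return alpha * (1.0 - ms_ssim(pred, target, **ms_kwargs)) + (1.0 - alpha) * l1
```

With the default five scales and an 11-pixel window, the image must be at least 176 pixels on each side. For smaller patches, pass fewer `weights` and a smaller `win`, for example `mixed_loss(pred, target, weights=(0.3, 0.3, 0.4), win=7)`.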
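The Results are reported as RMSE and PSNR and as average percentage changes against each baseline. As a rough illustration of how such figures are typically obtained, the sketch below computes both metrics against a reference (NDCT) image and a relative change between two scores. The function names and the `data_range` parameter are assumptions, and the abstract does not say exactly how its averages were formed, so this should not be read as the paper's evaluation protocol.

```python
import numpy as np


def rmse(pred, ref):
    # Root-mean-square error between a denoised image and its reference.
    pred = np.asarray(pred, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))


def psnr(pred, ref, data_range):
    # Peak signal-to-noise ratio in dB; data_range is the span of valid pixel values.
    err = rmse(pred, ref)
    if err == 0.0:
        return float("inf")
    return float(20.0 * np.log10(data_range / err))


def relative_change(baseline, proposed):
    # Percent change of `proposed` relative to `baseline`.
    # Negative means a reduction (desirable for RMSE); positive means an increase
    # (desirable for PSNR).
    return 100.0 * (proposed - baseline) / baseline
```

For example, `relative_change(rmse_baseline, rmse_proposed)` would be negative when the proposed method lowers RMSE, and averaging that value over a test set is one plausible way to arrive at a single summary percentage.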