CoreDiff: Contextual Error-Modulated Generalized Diffusion Model for Low-Dose CT Denoising and Generalization

Qi Gao, Zilong Li, Junping Zhang, Yi Zhang, Hongming Shan
DOI: https://doi.org/10.1109/TMI.2023.3320812
2023-10-06
Abstract: Low-dose computed tomography (CT) images suffer from noise and artifacts due to photon starvation and electronic noise. Recently, some works have attempted to use diffusion models to address the over-smoothness and training instability encountered by previous deep-learning-based denoising models. However, diffusion models suffer from long inference times due to the large number of sampling steps involved. Very recently, the cold diffusion model generalized classical diffusion models and offers greater flexibility. Inspired by cold diffusion, this paper presents a novel COntextual eRror-modulated gEneralized Diffusion model for low-dose CT (LDCT) denoising, termed CoreDiff. First, CoreDiff uses LDCT images in place of random Gaussian noise and employs a novel mean-preserving degradation operator to mimic the physical process of CT degradation. Because the informative LDCT image is the starting point of the sampling process, far fewer sampling steps are needed. Second, to alleviate the error accumulation caused by the imperfect restoration operator during sampling, we propose a novel ContextuaL Error-modulAted Restoration Network (CLEAR-Net), which leverages contextual information to keep the sampling process free of structural distortion and modulates time step embedding features for better alignment with the input at the next time step. Third, to generalize rapidly to a new, unseen dose level with as few resources as possible, we devise a one-shot learning framework that lets CoreDiff generalize faster and better using only a single LDCT image (un)paired with a normal-dose CT (NDCT) image. Extensive experimental results on two datasets demonstrate that CoreDiff outperforms competing methods in denoising and generalization performance, with a clinically acceptable inference time. Source code is available at https://github.com/qgao21/CoreDiff.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning, Medical Physics
What problem does this paper attempt to address?
The paper addresses the noise and artifacts present in low-dose computed tomography (LDCT) images. It proposes a Contextual Error-Modulated Generalized Diffusion Model, abbreviated as CoreDiff, for LDCT image denoising. Earlier deep-learning denoising models often over-smooth images, while diffusion models, although better at preserving details, need many sampling steps and therefore have long inference times. To address these issues, the paper makes three main contributions:

1. **A new generalized diffusion model**: Unlike conventional diffusion processes that end in pure Gaussian noise, CoreDiff uses the LDCT image itself as the endpoint of the forward (degradation) process, and hence as the starting point of sampling. It introduces a mean-preserving degradation operator to mimic the physical degradation of CT images. This significantly reduces the number of sampling steps and so improves efficiency.
2. **A new restoration network (CLEAR-Net)**: To mitigate the error accumulation caused by an imperfect restoration operator, the authors designed a Contextual Error-Modulated Restoration Network (CLEAR-Net). The network uses contextual information from adjacent slices to constrain structural distortion during sampling, and an Error Modulation Module (EMM) adjusts the time-step embedding features so that they align better with the input at the next time step.
3. **A one-shot learning framework**: To let the trained model adapt quickly to new, unseen dose levels, the paper proposes a one-shot learning framework that requires only a single LDCT image (paired or unpaired with a normal-dose CT image).

Experimental results on two datasets show that CoreDiff outperforms competing denoising methods in both denoising performance and generalization ability, with a clinically acceptable inference time (0.12 seconds per slice).
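
To make the first contribution concrete, here is a minimal NumPy sketch of a generalized (cold-diffusion-style) process that starts sampling from the LDCT image instead of Gaussian noise. It is an illustration under stated assumptions, not the operator or network defined in the paper: the linear blending schedule in `degrade`, the function names, and the oracle `restore` callable standing in for CLEAR-Net are all hypothetical. The update rule in `sample` follows the improved sampling scheme from the cold diffusion work, which subtracts the degradation at step t and adds the degradation at step t-1.

```python
import numpy as np

def degrade(x0, x_ld, t, T):
    """Blend a clean image x0 toward the low-dose image x_ld.

    Assumed linear schedule (hypothetical): the two weights sum to 1,
    so if x_ld and x0 share the same mean, every x_t keeps that mean.
    """
    alpha = t / T
    return (1.0 - alpha) * x0 + alpha * x_ld

def sample(x_ld, restore, T):
    """Cold-diffusion-style sampling that starts from the LDCT image.

    `restore(x_t, t)` is a placeholder for a learned restoration
    network; it returns an estimate of the clean image.
    """
    x_t = x_ld
    for t in range(T, 0, -1):
        x0_hat = restore(x_t, t)
        # Remove the step-t degradation, re-apply the milder step-(t-1) one.
        x_t = x_t - degrade(x0_hat, x_ld, t, T) + degrade(x0_hat, x_ld, t - 1, T)
    return x_t

# Toy data: a random "clean" image plus exactly zero-mean noise.
rng = np.random.default_rng(0)
x0 = rng.random((8, 8))
noise = rng.normal(0.0, 0.1, size=x0.shape)
x_ld = x0 + (noise - noise.mean())

x_mid = degrade(x0, x_ld, t=5, T=10)               # intermediate state
recovered = sample(x_ld, lambda x_t, t: x0, T=10)  # oracle restorer
```

With a perfect restorer the loop walks back exactly along the degradation path and ends at the clean image. With a learned, imperfect restorer, each step's estimate error feeds into the next step, which is the error accumulation that CLEAR-Net is designed to limit.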