Abstract: In this paper, we introduce a novel deep learning-based reconstruction technique for low-dose CT imaging that uses 3-dimensional convolutions to include sagittal information, unlike existing 2-dimensional networks, which exploit correlation only in the transverse plane. In the proposed reconstruction technique, sparse and noisy sinograms are back-projected to the image domain with the FBP operation, and the result is then denoised with a U-Net-like 3-dimensional network called 3D U-NetR. The proposed network is trained with synthetic and real chest CT images, and a 2D U-Net is trained with the same dataset to show the importance of the third dimension in recovering fine details. The proposed network shows better quantitative performance on SSIM and PSNR, especially on the real chest CT data. More importantly, 3D U-NetR captures medically critical visual details that cannot be visualized by a 2D network when reconstructing real CT images at 1/10 of the normal dose.
What problem does this paper attempt to address?
This paper addresses image reconstruction in low-dose computed tomography (CT) imaging. Specifically, the authors propose a deep learning-based reconstruction technique that uses three-dimensional convolutions to include sagittal-plane information, in contrast to existing two-dimensional networks, which only use correlations within the transverse (cross-sectional) plane.
### Problem Background
Since its introduction in the 20th century, X-ray computed tomography (CT) has played an important role in medicine. However, CT imaging inevitably exposes the patient to ionizing radiation, which may lead to cancer. The radiation dose can be reduced by acquiring fewer projections or by lowering the tube current, but either choice degrades the quality of the reconstructed image and turns reconstruction into an ill-posed problem.
### Existing Methods and Their Limitations
Traditional CT reconstruction techniques such as filtered back-projection (FBP) give adequate results for full-dose CT but are not effective in low-dose cases. Iterative techniques and regularization methods can improve the image quality, but they still have limitations. In addition, existing deep learning-based methods mainly rely on two-dimensional networks, which cannot capture cross-slice correlations and therefore lose some details.
### Proposed Solution
To address these problems, the authors propose the 3D U-NetR architecture, a deep learning model that combines three-dimensional convolutions with the U-Net structure. The main features of this approach include:
1. **Three-Dimensional Convolution**: Three-dimensional convolutions capture cross-slice correlations, which helps restore small but medically critical details.
2. **FBP Pre-processing**: FBP first back-projects the sparse and noisy sinogram into the image domain, and 3D U-NetR then denoises the result.
3. **Dataset Training**: The model is trained on synthetic and real chest CT images, and a 2D U-Net is trained on the same data as a comparison to show the importance of the third dimension.
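As a rough illustration of the first point, the sketch below implements a plain 3D cross-correlation on nested Python lists and compares a kernel of depth 1, which behaves like a slice-wise 2D filter, with a kernel of depth 3, which also mixes neighboring slices. The function names, the fixed averaging kernel, and the toy volume are assumptions made for illustration; the learned layers of 3D U-NetR are not reproduced here.

```python
def conv3d_valid(volume, kernel):
    """Valid-mode 3D cross-correlation on nested lists indexed [slice][row][col]."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(depth - kd + 1):
        plane = []
        for y in range(height - kh + 1):
            row = []
            for x in range(width - kw + 1):
                acc = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += kernel[dz][dy][dx] * volume[z + dz][y + dy][x + dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def box_kernel(kd, kh, kw):
    """Averaging kernel whose weights sum to 1 (a hypothetical stand-in for learned weights)."""
    weight = 1.0 / (kd * kh * kw)
    return [[[weight] * kw for _ in range(kh)] for _ in range(kd)]


# Toy volume: the same structure (value 3.0) in three adjacent slices,
# corrupted by a different per-slice offset (-3, 0, +3).
volume = [[[value] * 3 for _ in range(3)] for value in (0.0, 3.0, 6.0)]

in_plane = conv3d_valid(volume, box_kernel(1, 3, 3))     # slice-wise, 2D-style
cross_slice = conv3d_valid(volume, box_kernel(3, 3, 3))  # mixes all three slices

print([plane[0][0] for plane in in_plane])
print(cross_slice[0][0][0])
```

In this toy case the depth-1 kernel returns each slice's corrupted value unchanged, while the depth-3 kernel averages across slices and lands close to the shared value of 3.0. That is the kind of cross-slice information a purely 2D network has no access to.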
### Experimental Results
The experimental results show that 3D U-NetR performs better on quantitative metrics such as SSIM and PSNR, especially on real chest CT data. More importantly, 3D U-NetR can capture some medically critical visual details that cannot be visualized by 2D networks, especially in low-dose CT image reconstruction.
### Conclusion
By combining three-dimensional convolutions with the U-Net structure, 3D U-NetR shows clear advantages over its 2D counterpart in low-dose CT image reconstruction. It improves the image quality metrics and also captures medically critical details more reliably. The method offers a new option for low-dose CT imaging that helps reduce the radiation dose received by patients while maintaining high-quality image reconstruction.
### Formula Summary
The formulas involved in this paper are as follows:
- The CT reconstruction problem written as a linear inverse problem:
\[
y = Ax+\eta
\]
where \(A\in\mathbb{R}^{k\times l}\) is the forward operator, \(x\in\mathbb{R}^l\) is the vector form of the real CT image, \(y\in\mathbb{R}^k\) is the vector form of the sinogram, and \(\eta\in\mathbb{R}^k\) is the system noise.
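A minimal sketch of this forward model follows, assuming a tiny hand-written matrix in place of a real CT projection operator and Gaussian noise for \(\eta\) (the text only says "system noise"). All names and values are hypothetical. The toy system has \(k = 2\) measurements for \(l = 3\) unknowns, which hints at why recovering \(x\) from few projections is ill-posed.

```python
import random


def forward_project(A, x):
    """Compute the noise-free measurement A x for a matrix stored as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def simulate_measurement(A, x, noise_std, rng):
    """Return y = A x + eta, with eta drawn from a zero-mean Gaussian (an assumption)."""
    return [value + rng.gauss(0.0, noise_std) for value in forward_project(A, x)]


# Hypothetical toy system: 2 measurements of a 3-pixel "image".
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
x = [2.0, 3.0, 5.0]

rng = random.Random(0)  # fixed seed so the run is repeatable
y_clean = forward_project(A, x)
y_noisy = simulate_measurement(A, x, noise_std=0.1, rng=rng)

print(y_clean)  # [5.0, 8.0]
print(y_noisy)
```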
- The optimization problem that defines the image-to-image reconstruction method:
\[
\hat{w}=\arg\min_w\|f_w(X) - Y\|_2
\]
where \(X\) and \(Y\) represent the sparse or noisy CT image and the real CT image respectively, \(w\) is the model parameter, and \(f_w\) represents the nonlinear reconstruction function.
- The definition of root-mean-square error (RMSE), averaged over the \(n\) compared terms:
\[
\text{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\|\hat{y}_i - y_i\|^2}
\]
- The definition of peak signal-to-noise ratio (PSNR):
\[
\text{PSNR}=20\log_{10}\left(\frac{\text{MAX}_i}{\text{RMSE}}\right)
\]
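The two metrics above can be sketched in a few lines of Python. This assumes images flattened to lists of pixel values, the conventional mean over the \(n\) pixels inside the square root, and a peak value supplied by the caller. The function names and sample values are hypothetical.

```python
import math


def rmse(reconstruction, reference):
    """Root-mean-square error between two equally sized pixel lists."""
    n = len(reference)
    squared_error = sum((r - t) ** 2 for r, t in zip(reconstruction, reference))
    return math.sqrt(squared_error / n)


def psnr(reconstruction, reference, max_value):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = rmse(reconstruction, reference)
    if error == 0.0:
        return math.inf
    return 20.0 * math.log10(max_value / error)


# Hypothetical 4-pixel images scaled to [0, 1]; every pixel is off by about 0.1.
reference = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.1, 0.4, 0.9, 0.6]

print(rmse(reconstruction, reference))
print(psnr(reconstruction, reference, max_value=1.0))
```

With a uniform error of about 0.1 and a peak value of 1.0, the RMSE is about 0.1 and the PSNR is about 20 dB. A lower RMSE gives a higher PSNR.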
- The definition of structural similarity (SSIM):
\[
\text{SSIM}(x, y)=\f