High-resolution 3T to 7T MRI Synthesis with a Hybrid CNN-Transformer Model

Zach Eidex, Jing Wang, Mojtaba Safari, Eric Elder, Jacob Wynne, Tonghe Wang, Hui-Kuo Shu, Hui Mao, Xiaofeng Yang
2023-11-25
Abstract: 7 Tesla (7T) apparent diffusion coefficient (ADC) maps derived from diffusion-weighted imaging (DWI) demonstrate improved image quality and spatial resolution over 3 Tesla (3T) ADC maps. However, 7T magnetic resonance imaging (MRI) currently suffers from limited clinical availability, higher cost, and increased susceptibility to artifacts. To address these issues, we propose a hybrid CNN-transformer model to synthesize high-resolution 7T ADC maps from multi-modal 3T MRI. The Vision CNN-Transformer (VCT), composed of both Vision Transformer (ViT) blocks and convolutional layers, produces high-resolution synthetic 7T ADC maps from 3T ADC maps and 3T T1-weighted (T1w) MRI. The ViT blocks capture global image context, while the convolutional layers efficiently capture fine detail. The VCT model was validated on the publicly available Human Connectome Project Young Adult dataset, comprising 3T T1w, 3T DWI, and 7T DWI brain scans. The Diffusion Imaging in Python (DIPY) library was used to compute ADC maps from the DWI scans. A total of 171 patient cases were randomly divided into 130 training cases, 20 validation cases, and 21 test cases. The synthetic ADC maps were evaluated by comparing their similarity to the ground truth volumes with the following metrics: peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and mean squared error (MSE). The results are as follows: PSNR: 27.0 ± 0.9 dB, SSIM: 0.945 ± 0.010, and MSE: (2.0 ± 0.4) × 10^-3. Our predicted images demonstrate better spatial resolution and contrast compared to 3T MRI and to the predictions made by ResViT and pix2pix. These high-quality synthetic 7T MR images could be beneficial for disease diagnosis and intervention, especially when 7T MRI scanners are unavailable.
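As a rough illustration of how an ADC map relates to DWI signal, the sketch below applies the standard two-point mono-exponential model, S_b = S_0 · exp(-b · ADC), voxel by voxel. It is not the authors' pipeline: the abstract states that DIPY was used, whereas this sketch uses plain NumPy, and the function name `adc_from_dwi`, the `eps` floor, and the synthetic volume are all assumptions made for illustration.

```python
import numpy as np

def adc_from_dwi(s0, sb, b_value, eps=1e-6):
    """Two-point mono-exponential ADC estimate: ADC = ln(S0 / Sb) / b.

    s0      : signal without diffusion weighting (b = 0)
    sb      : diffusion-weighted signal acquired at `b_value`
    b_value : b-value in s/mm^2, so the result is in mm^2/s
    eps     : hypothetical floor that avoids log(0) in background voxels
    """
    s0 = np.asarray(s0, dtype=np.float64)
    sb = np.asarray(sb, dtype=np.float64)
    # Floor both signals so empty background voxels give a ratio of 1 (ADC = 0).
    ratio = np.clip(s0, eps, None) / np.clip(sb, eps, None)
    adc = np.log(ratio) / b_value
    # Negative ADC is non-physical and arises from noise; clamp it to zero.
    return np.clip(adc, 0.0, None)

# Synthetic check: build a volume with a known ADC and recover it.
true_adc = 0.8e-3                      # mm^2/s, an illustrative value
b = 1000.0                             # s/mm^2, an illustrative value
s0 = np.full((4, 4, 4), 1000.0)
sb = s0 * np.exp(-b * true_adc)
adc = adc_from_dwi(s0, sb, b)
print(adc.mean())
```

In practice a multi-shell acquisition would be fit with a dedicated diffusion library rather than a two-point ratio; the sketch only shows the relationship between the DWI signal and the ADC value.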
Medical Physics
What problem does this paper attempt to address?
The problem this paper attempts to address is: how to generate high-quality 7 Tesla (7T) Apparent Diffusion Coefficient (ADC) maps from existing 3 Tesla (3T) Magnetic Resonance Imaging (MRI) data. Specifically, while 7T MRI is superior to 3T MRI in terms of image quality and spatial resolution, its clinical application is limited due to high costs, equipment scarcity, and susceptibility to artifacts. Therefore, the researchers propose a hybrid CNN-Transformer model (Vision CNN-Transformer, VCT) to synthesize high-resolution 7T ADC maps from multimodal 3T MRI data (including 3T T1-weighted images and 3T ADC maps).

### Main Issues

1. **Limited clinical availability of 7T MRI**: 7T MRI equipment is expensive and not widely available.
2. **7T MRI is prone to artifacts**: 7T MRI is more sensitive to magnetic field inhomogeneity and susceptibility artifacts, which can degrade image quality.
3. **Improving image quality and contrast**: By synthesizing 7T ADC maps, higher image quality and contrast can be achieved without the need for 7T MRI equipment, thereby improving disease diagnosis and intervention.

### Solution

- **Model Design**: A hybrid CNN-Transformer model (VCT) is proposed, combining the advantages of Convolutional Neural Networks (CNN) and Vision Transformers (ViT), capable of capturing local details and handling global context.
- **Dataset**: The publicly available Human Connectome Project Young Adult dataset is used for validation, which includes 3T T1-weighted images, 3T DWI, and 7T DWI brain scans.
- **Evaluation Metrics**: The similarity between the synthesized 7T ADC maps and the real 7T ADC maps is evaluated using metrics such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Mean Squared Error (MSE).

### Experimental Results

- **Performance Improvement**: The VCT model outperforms existing ResViT and pix2pix models on all evaluation metrics, particularly excelling in PSNR and SSIM.
- **Image Quality**: The synthesized 7T ADC maps show significantly better spatial resolution and contrast compared to 3T ADC maps and effectively remove skull artifacts.

### Conclusion

The VCT model proposed in this study performs excellently in synthesizing high-quality 7T ADC maps from 3T MRI data, potentially enhancing the diagnostic capabilities of existing 3T MRI equipment in the absence of 7T MRI devices. This provides new tools and methods for disease diagnosis and treatment.
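The MSE and PSNR metrics named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, and the sketch assumes intensities normalized to [0, 1] (`data_range=1.0`), which the source does not state. SSIM is omitted because it needs a windowed implementation, such as `skimage.metrics.structural_similarity` in scikit-image.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error between two volumes of identical shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))

def psnr_from_mse(err, data_range=1.0):
    """PSNR in dB: 10 * log10(data_range^2 / MSE)."""
    if err == 0.0:
        return float("inf")            # identical volumes
    return float(10.0 * np.log10(data_range ** 2 / err))

def psnr(pred, target, data_range=1.0):
    """PSNR in dB computed directly from two volumes."""
    return psnr_from_mse(mse(pred, target), data_range)

# Toy example: a constant offset of 0.1 on a [0, 1] scale.
target = np.zeros((8, 8, 8))
pred = target + 0.1
print(mse(pred, target))               # approximately 0.01
print(psnr(pred, target))              # approximately 20 dB
```

Under the [0, 1] normalization assumption, an MSE of 2.0 × 10^-3 corresponds to roughly 27.0 dB, which is consistent with the PSNR reported in the abstract. Because the reported values are means over test cases, the correspondence is only approximate.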