Enhancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework

Zhuoyao Xin, Christopher Wu, Dong Liu, Chunming Gu, Jia Guo, Jun Hua
2023-12-18
Abstract: Image segmentation, real-value prediction, and cross-modal translation are critical challenges in medical imaging. In this study, we propose a versatile multi-task neural network framework, based on an enhanced Transformer U-Net architecture, capable of simultaneously, selectively, and adaptively addressing these medical image tasks. Validation is performed on a public repository of human brain MR and CT images. We decompose the traditional problem of synthesizing CT images into distinct subtasks, which include skull segmentation, Hounsfield unit (HU) value prediction, and image sequential reconstruction. To enhance the framework's versatility in handling multi-modal data, we expand the model with multiple image channels. Comparisons between synthesized CT images derived from T1-weighted and T2-FLAIR images were conducted, evaluating the model's capability to integrate multi-modal information from both morphological and pixel value perspectives.
Image and Video Processing, Computer Vision and Pattern Recognition, Quantitative Methods
What problem does this paper attempt to address?
The paper addresses three key challenges in medical imaging: image segmentation, real-value prediction, and cross-modal translation. Its specific goal is to synthesize high-quality CT images from multi-modal MRI data using a multi-task neural network framework. The research mainly addresses the following issues:

1. **Decomposition of Subtasks for Synthesizing CT Images**: The traditional task of converting MRI to CT images is decomposed into multiple subtasks, including skull segmentation, prediction of Hounsfield unit (HU) values within the region of interest, and sequential image reconstruction.
2. **Multi-modal Information Fusion**: Dual-modal (T1-weighted and T2-FLAIR) MRI data are used to generate synthetic CT images, and the model's performance is evaluated in terms of both morphology and pixel value prediction.
3. **3D Patch Extraction and Continuous Image Reconstruction**: A 3D patch extraction method is employed to handle large image volumes, maintain 3D structural continuity, and enhance the model's generalization ability. It avoids the loss of inter-slice continuity that occurs in traditional 2D or pseudo-3D methods.
4. **Performance Evaluation**: The model's performance on the subtasks and on the main conversion task is evaluated using several metrics (such as SSIM, Pearson correlation coefficient, Spearman correlation coefficient, and MAE), with particular attention to differences at the morphological and pixel levels.

The results show that the dual-modal model significantly outperforms the single-modal model in pixel value prediction and image correlation, whereas the improvement in the morphological segmentation task is relatively small. Overall, the framework demonstrates strong performance in handling multi-modal medical imaging data.
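
The 3D patch extraction and the pixel-level metrics mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`patch_starts`, `patch_corners_3d`, `mae`, `pearson`, `ranks`, `spearman`), the patch size, and the stride are all assumptions chosen for illustration, and the summary does not state how the paper tiles volumes or handles edges. SSIM is omitted because it needs windowed image statistics.

```python
import itertools
import math


def patch_starts(size, patch, stride):
    """Start indices of sliding windows along one axis (hypothetical helper)."""
    if patch > size:
        raise ValueError("patch is larger than the volume along this axis")
    starts = list(range(0, size - patch + 1, stride))
    # Assumption: add a last window flush with the edge so every voxel is covered.
    if starts[-1] != size - patch:
        starts.append(size - patch)
    return starts


def patch_corners_3d(shape, patch, stride):
    """Corner coordinates (z, y, x) of every 3D patch in a volume of given shape."""
    axes = [patch_starts(s, p, st) for s, p, st in zip(shape, patch, stride)]
    return list(itertools.product(*axes))


def mae(pred, target):
    """Mean absolute error between two equal-length sequences of HU values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))
```

For example, `patch_corners_3d((8, 8, 8), (4, 4, 4), (4, 4, 4))` yields the eight non-overlapping patch corners of an 8×8×8 volume. Choosing a stride smaller than the patch size produces overlapping patches, which is one common way to preserve continuity across patch borders when the predictions are stitched back together.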