Soft Masked Mamba Diffusion Model for CT to MRI Conversion

Zhenbin Wang, Lei Zhang, Lituan Wang, Zhenwei Zhang

2024-06-23

Abstract: Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) are the predominant modalities in medical imaging. Although MRI captures anatomical structures in greater detail than CT, it entails higher financial cost and requires longer image acquisition times. In this study, we aim to train a latent diffusion model for CT to MRI conversion, replacing the commonly used U-Net or Transformer backbone with a State-Space Model (SSM) called Mamba that operates on latent patches. First, we note critical oversights in the scan scheme of most Mamba-based vision methods, including inadequate attention to the spatial continuity of patch tokens and a lack of consideration for their varying importance to the target task. Second, building on this insight, we introduce Diffusion Mamba (DiffMa), which employs a soft mask to integrate Cross-Sequence Attention into Mamba and conducts selective scanning in a spiral manner. Lastly, extensive experiments demonstrate impressive performance by DiffMa in medical image generation tasks, with notable advantages in input scaling efficiency over existing benchmark models. The code and models are available at https://github.com/wongzbb/DiffMa-Diffusion-Mamba

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses the problem of converting computed tomography (CT) images into magnetic resonance imaging (MRI) images. MRI shows anatomical detail better than CT, but it is more expensive and takes longer to acquire. Using an image generation model to convert CT images into MRI images could therefore expand the scope of diagnostic examinations without increasing costs.

The paper proposes Diffusion Mamba (DiffMa), a diffusion model for the CT to MRI conversion task built on the Mamba State-Space Model (SSM). Unlike traditional U-Net or Transformer architectures, DiffMa uses Mamba to process latent patches. The paper identifies two key issues in existing Mamba-based vision methods: their scan schemes pay insufficient attention to the spatial continuity of patch tokens, and they do not account for the varying importance of those tokens to the target task. To overcome these issues, the paper introduces a Soft Mask mechanism, which integrates cross-sequence attention into Mamba, and a Spiral-Scan scheme, which keeps the scanning sequence spatially continuous.

Experimental results show that DiffMa performs well in medical image generation tasks and, in particular, surpasses existing benchmark models in input scaling efficiency. The paper also compares DiffMa in detail with methods based on CNNs, ViT, and recently proposed Mamba variants, and reports a significant performance improvement under the same number of iterations. Overall, DiffMa obtains a global receptive field while maintaining linear complexity in the CT to MRI conversion task.
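To illustrate why scan order matters for spatial continuity, the following minimal Python sketch compares a row-major (raster) scan with a spiral scan over a grid of patch tokens. It is an assumption-laden illustration, not DiffMa's implementation: the function names `spiral_order`, `raster_order`, and `max_step` are hypothetical, and the clockwise, outside-in direction is chosen only for the example, since the summary above does not specify the exact spiral used in the paper.

```python
def raster_order(height, width):
    """Row-major scan: patch indices 0, 1, 2, ... in reading order."""
    return list(range(height * width))


def spiral_order(height, width):
    """Clockwise, outside-in spiral over a height x width patch grid.

    Returns row-major patch indices in visiting order. The direction and
    starting corner are assumptions made for this illustration.
    """
    top, bottom, left, right = 0, height - 1, 0, width - 1
    order = []
    while top <= bottom and left <= right:
        for c in range(left, right + 1):            # top edge, left to right
            order.append(top * width + c)
        for r in range(top + 1, bottom + 1):        # right edge, downward
            order.append(r * width + right)
        if top < bottom:
            for c in range(right - 1, left - 1, -1):  # bottom edge, right to left
                order.append(bottom * width + c)
        if left < right:
            for r in range(bottom - 1, top, -1):    # left edge, upward
                order.append(r * width + left)
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order


def max_step(order, width):
    """Largest Manhattan distance on the grid between consecutive tokens."""
    coords = [divmod(i, width) for i in order]
    return max(
        abs(r1 - r0) + abs(c1 - c0)
        for (r0, c0), (r1, c1) in zip(coords, coords[1:])
    )


if __name__ == "__main__":
    h, w = 4, 4
    print("spiral:", spiral_order(h, w))
    print("raster max step:", max_step(raster_order(h, w), w))
    print("spiral max step:", max_step(spiral_order(h, w), w))
```

In the raster scan, the last token of each row is followed by the first token of the next row, which lies on the opposite side of the grid. In the spiral scan, every consecutive pair of tokens is adjacent on the grid, which is the continuity property the summary attributes to the Spiral-Scan scheme.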

Synthetic CT Generation from MRI using 3D Transformer-based Denoising Diffusion Model

MambaRecon: MRI Reconstruction with Structured State Space Models

MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain Segmentation

MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration

CT-Mamba: A Hybrid Convolutional State Space Model for Low-Dose CT Denoising

LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba

Conversion Between CT and MRI Images Using Diffusion and Score-Matching Models

MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation

U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation

DiffCMR: Fast Cardiac MRI Reconstruction with Diffusion Probabilistic Models

TC-DiffRecon: Texture coordination MRI reconstruction method based on diffusion model and modified MF-UNet method

DCT-net: Dual-domain cross-fusion transformer network for MRI reconstruction

MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion

MedMamba: Vision Mamba for Medical Image Classification

Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba

HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation

SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

Latent Diffusion Model for Medical Image Standardization and Enhancement

Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation