Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation

Yinchi Zhou, Tianqi Chen, Jun Hou, Huidong Xie, Nicha C. Dvornek, S. Kevin Zhou, David L. Wilson, James S. Duncan, Chi Liu, Bo Zhou

2024-08-14

Abstract: Image-to-image translation is a vital component in medical imaging processing, with many uses in a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Even though both GAN and DM methods have individually exhibited their capability in medical image translation tasks, the potential of combining a GAN and DM to further improve translation performance and to enable uncertainty estimation remains largely unexplored. In this work, we address these challenges by proposing a Cascade Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation. To reduce the required number of iterations and ensure robust performance, our method first obtains a conditional GAN-generated prior image that will be used for the efficient reverse translation with a DM in the subsequent step. Additionally, a multi-path shortcut diffusion strategy is employed to refine translation results and estimate uncertainty. A cascaded pipeline further enhances translation quality, incorporating residual averaging between cascades. We collected three different medical image datasets with two sub-tasks for each dataset to test the generalizability of our approach. Our experimental results found that CMDM can produce high-quality translations comparable to state-of-the-art methods while providing reasonable uncertainty estimations that correlate well with the translation error.

Image and Video Processing, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper asks how Generative Adversarial Networks (GANs) and Diffusion Models (DMs) can be combined for medical image translation so that the output is both high quality and accompanied by an uncertainty estimate. Both families perform well on medical image translation tasks, but each has shortcomings:

1. **Limitations of GANs**:
   - Training is unstable, because the optimization of the generator and the discriminator must be kept in balance.
   - They provide no uncertainty estimate.
2. **Limitations of Diffusion Models**:
   - They require a large number of iterative steps, which makes processing slow.
   - Although they can provide uncertainty estimates, their outputs are inconsistent across different noise initializations.

To address these issues, the paper proposes a Cascade Multi-path Shortcut Diffusion Model (CMDM), which has three components:

1. **Shortcut Strategy**: A prior image generated by a conditional GAN (cGAN) serves as the starting point of the diffusion process, which reduces the number of iterations required and improves the consistency and robustness of the translation.
2. **Multi-path Strategy**: Several shortcut reverse-diffusion passes are run with different noise samples, and the results of these paths are averaged to refine the translation and to estimate uncertainty.
3. **Cascade System**: The pipeline is repeated in a cascade, with a residual averaging step between consecutive cascades to further refine the translation.

Experiments on three medical image datasets, each with two sub-tasks, show that CMDM produces high-quality translations comparable to state-of-the-art methods and provides reasonable uncertainty estimates that correlate well with the translation error.
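The shortcut, multi-path, and cascade ideas described here can be illustrated with a small NumPy sketch. This is not the authors' implementation: `toy_eps_model` is a hypothetical stand-in for a trained conditional noise predictor, the linear noise schedule and the values of `t_start`, `num_paths`, `num_cascades`, and `residual_weight` are illustrative assumptions, and the residual averaging rule shown is only one plausible reading of "residual averaging between cascades", since the summary does not give the formula.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Standard DDPM-style linear beta schedule (an assumption, not from the paper).
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def toy_eps_model(x_t, t, cond, alpha_bars, strength=0.8):
    # Hypothetical stand-in for a trained network. It pretends `cond` is the
    # clean target and deliberately under-estimates the noise (strength < 1),
    # so that different noise paths end at slightly different images.
    oracle = (x_t - np.sqrt(alpha_bars[t]) * cond) / np.sqrt(1.0 - alpha_bars[t])
    return strength * oracle

def shortcut_reverse(prior, cond, t_start, schedule, rng):
    betas, alphas, alpha_bars = schedule
    # Shortcut: noise the prior image directly to step t_start instead of
    # starting the reverse process from pure noise at the last step.
    eps = rng.standard_normal(prior.shape)
    x = np.sqrt(alpha_bars[t_start]) * prior + np.sqrt(1.0 - alpha_bars[t_start]) * eps
    # Ancestral DDPM sampling from t_start down to 0.
    for t in range(t_start, -1, -1):
        eps_hat = toy_eps_model(x, t, cond, alpha_bars)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

def multi_path(prior, cond, t_start, schedule, num_paths, rng):
    # Run several shortcut reverse passes with independent noise, then use the
    # per-pixel mean as the refined image and the per-pixel std as uncertainty.
    paths = np.stack([shortcut_reverse(prior, cond, t_start, schedule, rng)
                      for _ in range(num_paths)])
    return paths.mean(axis=0), paths.std(axis=0)

def cascaded_translate(prior, cond, t_start=200, num_paths=4, num_cascades=2,
                       residual_weight=0.5, seed=0):
    rng = np.random.default_rng(seed)
    schedule = make_schedule()
    current = prior
    for _ in range(num_cascades):
        mean, uncertainty = multi_path(current, cond, t_start, schedule, num_paths, rng)
        # Assumed residual averaging: move the current estimate part of the way
        # toward the multi-path mean before the next cascade.
        current = current + residual_weight * (mean - current)
    # Uncertainty is the per-pixel std from the last cascade.
    return current, uncertainty

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    cond = rng.random((16, 16))   # toy conditioning image
    prior = cond + 0.5            # toy "cGAN prior" with a systematic error
    out, unc = cascaded_translate(prior, cond)
    print("mean abs error of prior :", np.abs(prior - cond).mean())
    print("mean abs error of output:", np.abs(out - cond).mean())
    print("mean uncertainty        :", unc.mean())
```

In a real system the prior would come from a trained cGAN and the noise predictor from a trained conditional diffusion network; the sketch only shows how the three pieces fit together and why starting at `t_start` rather than the final step shortens the reverse process.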

Cross-conditioned Diffusion Model for Medical Image to Image Translation

FDDM: Unsupervised Medical Image Translation with a Frequency-Decoupled Diffusion Model

Unsupervised Medical Image Translation with Adversarial Diffusion Models

Zero-shot Medical Image Translation via Frequency-Guided Diffusion Models

Adaptive Latent Diffusion Model for 3D Medical Image to Image Translation: Multi-modal Magnetic Resonance Imaging Study

Cross-Domain Medical Image Translation by Shared Latent Gaussian Mixture Model

Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning

MedGAN: Medical Image Translation using GANs

2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction

Uncertainty-Guided Progressive GANs for Medical Image Translation

Mutual Information Guided Diffusion for Zero-Shot Cross-Modality Medical Image Translation

GH-DDM: the generalized hybrid denoising diffusion model for medical image generation

Reliable Multi-modal Medical Image-to-image Translation Independent of Pixel-wise Aligned Data

Med-cDiff: Conditional Medical Image Generation with Diffusion Models

Self-Consistent Recursive Diffusion Bridge for Medical Image Translation

MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant

ContourDiff: Unpaired Image-to-Image Translation with Structural Consistency for Medical Imaging

TarGAN: Target-Aware Generative Adversarial Networks for Multi-modality Medical Image Translation

Unsupervised Medical Image Translation Using Cycle-MedGAN