Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation

Yinchi Zhou, Tianqi Chen, Jun Hou, Huidong Xie, Nicha C. Dvornek, S. Kevin Zhou, David L. Wilson, James S. Duncan, Chi Liu, Bo Zhou
2024-08-14
Abstract: Image-to-image translation is a vital component of medical image processing, with many uses across a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Although both GAN and DM methods have individually shown their capability in medical image translation tasks, the potential of combining a GAN and a DM to further improve translation performance and to enable uncertainty estimation remains largely unexplored. In this work, we address these challenges by proposing a Cascade Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation. To reduce the required number of iterations and ensure robust performance, our method first obtains a conditional GAN-generated prior image, which is then used for efficient reverse translation with a DM in the subsequent step. Additionally, a multi-path shortcut diffusion strategy is employed to refine translation results and estimate uncertainty. A cascaded pipeline further enhances translation quality, incorporating residual averaging between cascades. We collected three different medical image datasets, with two sub-tasks for each dataset, to test the generalizability of our approach. Our experiments show that CMDM can produce high-quality translations comparable to state-of-the-art methods while providing reasonable uncertainty estimations that correlate well with the translation error.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper asks how to combine Generative Adversarial Networks (GANs) and Diffusion Models (DMs) in medical image translation so that the output is both high quality and accompanied by an uncertainty estimate. Both families perform well on medical image translation tasks individually, but each has shortcomings:

1. **Limitations of GANs**:
   - Training is unstable, because the optimization of the generator and the discriminator must be kept in balance.
   - They lack uncertainty estimation.
2. **Limitations of Diffusion Models**:
   - They require a large number of iterative steps, which makes processing slow.
   - Although they can provide uncertainty estimation, their outputs are inconsistent across different noise initializations.

To address these issues, the paper proposes a Cascade Multi-path Shortcut Diffusion Model (CMDM). The method improves on existing techniques in three ways:

1. **Shortcut Strategy**: A prior image generated by a conditional Generative Adversarial Network (cGAN) serves as the starting point of the diffusion process, which reduces the required number of iterations and improves the consistency and robustness of the translation.
2. **Multi-path Strategy**: The shortcut reverse diffusion process is run several times with different noise, and the results of the multiple paths are averaged to refine translation quality and estimate uncertainty.
3. **Cascade System**: Several such stages are chained in a cascade, with a residual averaging strategy between cascades to further refine the translation results.

Experimental results show that CMDM produces translations comparable to state-of-the-art methods across various medical image translation tasks and provides reasonable uncertainty estimates that correlate well with the translation error.
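The three components can be illustrated with a small NumPy sketch. This is not the authors' implementation. The linear noise schedule, the deterministic DDIM-style reverse step, the function names, and the `predict_x0` callable (a stand-in for a trained conditional diffusion network) are all assumptions made for illustration. In particular, "residual averaging" is read here as blending each cascade's averaged output with the prior it started from, which is only one plausible interpretation.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear beta schedule (assumed, DDPM-style)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def shortcut_path(prior, cond, predict_x0, alpha_bar, t_start, rng):
    """One path: noise the prior up to step t_start, then denoise back to step 0."""
    ab = alpha_bar[t_start]
    noise = rng.standard_normal(prior.shape)
    # Shortcut: start from a noised prior at t_start instead of pure noise at T.
    x = np.sqrt(ab) * prior + np.sqrt(1.0 - ab) * noise
    for t in range(t_start, 0, -1):
        x0_hat = predict_x0(x, t, cond)  # hypothetical network call
        eps_hat = (x - np.sqrt(alpha_bar[t]) * x0_hat) / np.sqrt(1.0 - alpha_bar[t])
        ab_prev = alpha_bar[t - 1]
        # Deterministic DDIM-style update (an assumed sampler choice).
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
    return predict_x0(x, 0, cond)


def multi_path(prior, cond, predict_x0, alpha_bar, t_start, num_paths, rng):
    """Run several shortcut paths with different noise; return mean and std."""
    paths = np.stack([
        shortcut_path(prior, cond, predict_x0, alpha_bar, t_start, rng)
        for _ in range(num_paths)
    ])
    return paths.mean(axis=0), paths.std(axis=0)


def cmdm_sketch(prior, cond, predict_x0, alpha_bar,
                t_start=100, num_paths=4, num_cascades=2, seed=0):
    """Chain multi-path stages; the per-pixel std is used as the uncertainty map."""
    rng = np.random.default_rng(seed)
    current_prior = prior
    for _ in range(num_cascades):
        mean, std = multi_path(current_prior, cond, predict_x0,
                               alpha_bar, t_start, num_paths, rng)
        # Assumed form of residual averaging: blend output with the stage's prior.
        current_prior = 0.5 * (current_prior + mean)
    return mean, std


def toy_predict_x0(x_t, t, cond):
    """Toy stand-in for a trained denoiser: pulls the estimate toward cond."""
    return 0.5 * cond + 0.5 * x_t


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    cond = data_rng.random((16, 16))  # source-modality image (synthetic)
    prior = cond + 0.1 * data_rng.standard_normal((16, 16))  # stand-in for cGAN output
    translation, uncertainty = cmdm_sketch(prior, cond, toy_predict_x0, make_alpha_bar())
    print(translation.shape, uncertainty.shape)
```

In this sketch each path costs `t_start + 1` denoiser calls rather than the full schedule length, which is where the shortcut saves computation. The number of paths and cascades trades extra compute for smoother averages and a less noisy uncertainty map.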