Similarity-aware Syncretic Latent Diffusion Model for Medical Image Translation with Representation Learning

Tingyi Lin, Pengju Lyu, Jie Zhang, Yuqing Wang, Cheng Wang, Jianjun Zhu
2024-06-20
Abstract: Non-contrast CT (NCCT) imaging may reduce image contrast and anatomical visibility, potentially increasing diagnostic uncertainty. In contrast, contrast-enhanced CT (CECT) facilitates the observation of regions of interest (ROI). Leading generative models, especially the conditional diffusion model, demonstrate remarkable capabilities in medical image modality translation. Typical conditional diffusion models generate images under the guidance of segmentation labels. Limited access to authentic guidance and its low cardinality can pose challenges to the practical clinical application of conditional diffusion models. To balance generative quality and clinical practicality, we propose a novel syncretic generative model based on the latent diffusion model for medical image translation (S$^2$LDM), which can achieve high-fidelity reconstruction without requiring additional conditions during inference. S$^2$LDM enhances the similarity between images of distinct modalities via syncretic encoding and diffusion, promoting amalgamated information in the latent space and generating medical images with more detail in contrast-enhanced regions. However, syncretic latent spaces in the frequency domain tend to favor lower frequencies, which are commonly located in identical anatomical structures. Thus, S$^2$LDM applies an adaptive similarity loss and dynamic similarity to guide the generation and to compensate for the shortfall in high-frequency details throughout the training process. Quantitative experiments confirm the effectiveness of our approach in medical image translation. Our code will be released later.
Image and Video Processing, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the low image contrast and limited anatomical visibility of non-contrast CT (NCCT) images, which increase diagnostic uncertainty in clinical practice. Contrast-enhanced CT (CECT), by comparison, makes regions of interest (ROI) easier to observe. However, CECT relies on expensive iodinated contrast agents (ICAs), which can pose risks to patients with iodine allergies or renal insufficiency, limiting its clinical application. The goal of this study is therefore to generate CECT-like images from NCCT images, reducing dependence on ICAs while providing more precise pathological details. To this end, the authors propose S$^2$LDM, a novel syncretic generative model for medical image translation based on the latent diffusion model, which can achieve high-fidelity reconstruction without additional conditions during inference. S$^2$LDM enhances the similarity between images of different modalities through syncretic encoding and diffusion, fuses their information in the latent space, and generates images with more detail in contrast-enhanced regions. Because the syncretic latent space tends to favor lower frequencies, S$^2$LDM also applies an adaptive similarity loss and dynamic similarity to guide generation and to compensate for the lack of high-frequency details throughout training. Quantitative experiments validate the effectiveness of the method in medical image translation.
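The claim that a representation favors lower frequencies can be checked numerically. The following is a minimal sketch, not the authors' adaptive similarity loss or any part of S$^2$LDM: it only measures what share of an image's spectral energy lies near zero frequency. The function name `low_frequency_energy_ratio` and the `radius_fraction` parameter are hypothetical names chosen for illustration, and the inputs are synthetic arrays rather than CT data.

```python
import numpy as np


def low_frequency_energy_ratio(image, radius_fraction=0.25):
    """Return the share of 2D spectral energy inside a centered low-frequency disc.

    Hypothetical helper for illustration only; it is not taken from the paper.
    radius_fraction is the disc radius relative to each axis's Nyquist limit.
    """
    image = np.asarray(image, dtype=np.float64)
    if image.ndim != 2:
        raise ValueError("expected a 2D array")

    # Shift the spectrum so the zero-frequency (DC) bin sits at index (h // 2, w // 2).
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2

    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    # Normalised distance from the DC bin: 1.0 corresponds to the Nyquist limit of an axis.
    dist = np.hypot((yy - h // 2) / (h / 2), (xx - w // 2) / (w / 2))

    total = power.sum()
    if total == 0.0:
        return 0.0  # an all-zero image carries no spectral energy
    return float(power[dist <= radius_fraction].sum() / total)


if __name__ == "__main__":
    x = np.arange(64)
    # One full sine period across the width: energy sits at the lowest non-zero frequency.
    smooth = np.tile(np.sin(2 * np.pi * x / 64), (64, 1))
    # White noise spreads its energy roughly evenly over all frequencies.
    noise = np.random.default_rng(0).standard_normal((64, 64))
    print("smooth:", low_frequency_energy_ratio(smooth))
    print("noise: ", low_frequency_energy_ratio(noise))
```

Comparing this ratio between a generated image and its CECT reference would be one simple way to see whether high-frequency detail is being lost; a ratio noticeably higher for the generated image would suggest over-smoothing.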