Abstract: Medical image-to-image translation is a key task in computer vision and generative artificial intelligence, and it is highly applicable to medical image analysis. GAN-based methods are the mainstream image translation methods, but they often ignore the variation and distribution of images in the frequency domain, or only take simple measures to align high-frequency information, which can lead to distortion and low quality of the generated images. To solve these problems, we propose a novel method called frequency domain decomposition translation (FDDT). This method decomposes the original image into a high-frequency component and a low-frequency component, with the high-frequency component containing the details and identity information, and the low-frequency component containing the style information. Next, the high-frequency and low-frequency components of the translated image are aligned, in the spatial domain and within the same frequency band, with the translated results of the high-frequency and low-frequency components of the original image, thus preserving the identity information of the image while destroying as little of its stylistic information as possible. We conduct extensive experiments on MRI images and natural images with FDDT and several mainstream baseline models, and we use four evaluation metrics to assess the quality of the generated images. Compared with the baseline models, in the best case FDDT reduces Fréchet inception distance by up to 24.4% and mean squared error by up to 31%, and improves structural similarity by up to 4.4% and peak signal-to-noise ratio by up to 5.8%. Compared with the previous method, in the best case FDDT reduces Fréchet inception distance by up to 23.7% and mean squared error by up to 31.6%, and improves structural similarity by up to 1.8% and peak signal-to-noise ratio by up to 6.8%.
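The decomposition step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which filter FDDT uses, so the Gaussian low-pass filter, the `sigma` value, and the function names below are assumptions made for illustration only; the high-frequency band is taken as the residual, so the two bands always sum back to the input.

```python
import numpy as np


def gaussian_lowpass_mask(shape, sigma):
    """Gaussian transfer function in the unshifted FFT layout (assumed filter)."""
    fy = np.fft.fftfreq(shape[0])[:, None]  # vertical frequencies, cycles/pixel
    fx = np.fft.fftfreq(shape[1])[None, :]  # horizontal frequencies, cycles/pixel
    return np.exp(-(fx**2 + fy**2) / (2.0 * sigma**2))


def decompose(image, sigma=0.05):
    """Split a 2-D image into low- and high-frequency components."""
    spectrum = np.fft.fft2(image)
    mask = gaussian_lowpass_mask(image.shape, sigma)
    # The mask is real and symmetric, so the filtered image is real up to
    # floating-point error; .real discards the negligible imaginary part.
    low = np.fft.ifft2(spectrum * mask).real
    high = image - low  # residual: edges and fine detail
    return low, high


# Synthetic stand-in for an image slice (not real MRI data).
rng = np.random.default_rng(0)
image = rng.random((64, 64))
low, high = decompose(image)
```

Because the mask equals 1 at zero frequency, the low-frequency band keeps the mean intensity of the image, while the high-frequency band has a mean close to zero.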

What problem does this paper attempt to address?

The paper primarily aims to address issues that arise when Generative Adversarial Networks (GANs) are used for medical image translation:

1. **Insufficient handling of frequency domain information**: Existing GAN-based methods often overlook the variation and distribution of images in the frequency domain, or only take simple measures to align high-frequency information, which may result in distorted, lower-quality generated images.
2. **Image distortion caused by inconsistent frequency components**: Current methods use an L1 loss to align the frequency domain components of the source and generated images even though these components are not equal, which can distort the generated images and does not meet the high precision requirements of medical image translation.
3. **Unconstrained translation of low-frequency components**: Existing methods do not constrain the translation of low-frequency components, which leaves the frequency domain constraints incomplete.
4. **Reliance on a dual-input architecture**: Existing methods such as FDIT are based on a dual-input single-output (DISO) architecture that requires two inputs (a source image and a reference image), which increases the cost of clinical sample acquisition. Moreover, such a method is not suitable for a single-input single-output (SISO) architecture, and the generated images may be biased by the reference input.

To address these challenges, the paper proposes a new method called "Frequency Domain Decomposition Translation" (FDDT). The main contributions of FDDT can be summarized as:

- **Comprehensive handling of high- and low-frequency information**: FDDT considers low-frequency style information as well as high-frequency details and identity information, handling both bands without increasing the model size.
- **Improved image quality**: Experimental results show that applying FDDT to benchmark models (such as CycleGAN, Pix2Pix, and UNIT) can significantly improve the quality of generated images, as measured by Fréchet inception distance (FID), peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and mean squared error (MSE).
- **Flexibility and generality**: FDDT can be combined with mainstream image-to-image translation frameworks without increasing the model inference cost; experiments on natural image datasets also validate its effectiveness, demonstrating the method's generality.

Through these improvements, FDDT enhances the quality of generated images while maintaining computational efficiency, providing strong support for medical image translation.
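The band-wise alignment summarized above (each band of the translated image compared with the translation of the same band of the source, using a single input) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Gaussian filter, the mean absolute difference as the distance, the equal weighting of the two bands, and the names `split_bands` and `band_alignment_loss` are all hypothetical, and `generator` stands in for any image-to-image network.

```python
import numpy as np


def split_bands(image, sigma=0.05):
    """Return (low, high) bands using an assumed Gaussian low-pass filter."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    mask = np.exp(-(fx**2 + fy**2) / (2.0 * sigma**2))
    low = np.fft.ifft2(np.fft.fft2(image) * mask).real
    return low, image - low


def band_alignment_loss(generator, x, sigma=0.05):
    """Compare each band of generator(x) with generator applied to that band of x."""
    x_low, x_high = split_bands(x, sigma)
    y_low, y_high = split_bands(generator(x), sigma)
    # Assumed distance: mean absolute difference within each band.
    loss_low = np.mean(np.abs(y_low - generator(x_low)))
    loss_high = np.mean(np.abs(y_high - generator(x_high)))
    return loss_low + loss_high


# Toy "generators" on a synthetic image, for a sanity check only.
rng = np.random.default_rng(0)
x = rng.random((32, 32))
linear_loss = band_alignment_loss(lambda img: 0.5 * img, x)
nonlinear_loss = band_alignment_loss(lambda img: img ** 2, x)
```

A linear toy generator commutes with the filter, so its loss is zero up to floating-point error, whereas the nonlinear one yields a positive loss. In actual training the same computation would be written in a differentiable framework and added to the base model's objective, which leaves the generator architecture and inference cost unchanged.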

Frequency Domain Decomposition Translation for Enhanced Medical Image Translation Using GANs

Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving

FDDM: Unsupervised Medical Image Translation with a Frequency-Decoupled Diffusion Model

Optimizing the Quality of Fourier Single-Pixel Imaging Via Generative Adversarial Network

MedGAN: Medical Image Translation using GANs

Zero-shot Medical Image Translation via Frequency-Guided Diffusion Models

fRegGAN with K-space Loss Regularization for Medical Image Translation

AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation

Uncertainty-Guided Progressive GANs for Medical Image Translation

Segmentation-Renormalized Deep Feature Modulation for Unpaired Image Harmonization

Memory-efficient GAN-based domain translation of high resolution 3D medical images

UGC: Unified GAN Compression for Efficient Image-to-Image Translation

Ambient-Pix2PixGAN for Translating Medical Images from Noisy Data

GAN-GA: A Generative Model based on Genetic Algorithm for Medical Image Generation

TarGAN: Target-Aware Generative Adversarial Networks for Multi-modality Medical Image Translation

IFGAN: Pre- to Post-Contrast Medical Image Synthesis Based on Interactive Frequency GAN

MRI to PET Cross-Modality Translation using Globally and Locally Aware GAN (GLA-GAN) for Multi-Modal Diagnosis of Alzheimer's Disease

ReeGAN: MRI image edge-preserving synthesis based on GANs trained with misaligned data

Bridging the gap between paired and unpaired medical image translation

FDG-PET to T1 Weighted MRI Translation with 3D Elicit Generative Adversarial Network (E-GAN)

Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation