Asymmetric GANs for Image-to-Image Translation

Hao Tang, Nicu Sebe
2024-07-12
Abstract: Existing models for unsupervised image translation with Generative Adversarial Networks (GANs) can learn the mapping from the source domain to the target domain using a cycle-consistency loss. However, these methods always adopt a symmetric network architecture to learn both forward and backward cycles. Because of the task complexity and cycle input difference between the source and target domains, the inequality in bidirectional forward-backward cycle translations is significant and the amount of information between two domains is different. In this paper, we analyze the limitation of existing symmetric GANs in asymmetric translation tasks, and propose an AsymmetricGAN model with both translation and reconstruction generators of unequal sizes and different parameter-sharing strategy to adapt to the asymmetric need in both unsupervised and supervised image translation tasks. Moreover, the training stage of existing methods has the common problem of model collapse that degrades the quality of the generated images, thus we explore different optimization losses for better training of AsymmetricGAN, making image translation with higher consistency and better stability. Extensive experiments on both supervised and unsupervised generative tasks with 8 datasets show that AsymmetricGAN achieves superior model capacity and better generation performance compared with existing GANs. To the best of our knowledge, we are the first to investigate the asymmetric GAN structure on both unsupervised and supervised image translation tasks.
Computer Vision and Pattern Recognition, Machine Learning, Image and Video Processing
What problem does this paper attempt to address?
This paper addresses challenges in image-to-image translation with Generative Adversarial Networks (GANs), in both unsupervised and supervised settings. Its core contribution is a novel Asymmetric Generative Adversarial Network (AsymmetricGAN), designed to overcome limitations of existing methods. The main problems the paper attempts to solve are:

1. **Limitations of existing GAN models**: Current unsupervised image translation models often use a cycle-consistency loss to learn the mapping from the source domain to the target domain, but they always adopt a symmetric network architecture for the forward and backward cycles. In practice, the source and target domains differ in task complexity and in the amount of information they carry, so the two directions of the cycle are not equally hard. This mismatch makes the model harder to optimize and limits how well it generalizes.
2. **Model collapse issue**: Existing methods are prone to model collapse during training, which degrades the quality of the generated images.
3. **Efficiency issues in multi-domain image translation**: Multi-domain image translation must handle several image domains. Existing methods such as CycleGAN and DualGAN require training a separate model for each pair of domains. StarGAN needs only one model, but it uses the same generator for both translation and reconstruction, which limits its performance.

To address these issues, the paper proposes the following key technical points:

- **Asymmetric dual generator structure**: Two generators with different sizes and parameter-sharing strategies are designed to match the different needs of the translation and reconstruction tasks. The translation generator converts images from the source domain to the target domain, while the reconstruction generator reconstructs the original image from the translated image.
- **Improved loss functions**: To improve training stability and the consistency of generated images, the paper explores several loss functions, including a color cycle-consistency loss, a Multi-Scale Structural Similarity (Multi-Scale SSIM) loss, and a conditional identity-preserving loss.
- **Network optimization**: Training AsymmetricGAN with these loss functions yields more consistent and stable image translation results.

In summary, the paper targets the weaknesses of existing unsupervised and supervised image translation methods with an asymmetric generator architecture and improved loss functions, and reports better generation performance than existing GANs on 8 datasets.
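The dual-generator idea can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: the two-layer linear "generators", the helper names (`make_generator`, `run_generator`, `cycle_l1`), the layer sizes, and the choice of which generator is larger are all assumptions made for illustration. It only shows two separately parameterized generators of unequal size chained into one cycle, with an L1 cycle-consistency term between the input and its reconstruction.

```python
import random

def make_generator(hidden, dim, seed):
    """Toy two-layer generator (hypothetical stand-in for a conv net).

    The input is the image vector plus one scalar domain label.
    """
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(dim + 1)] for _ in range(hidden)]
    w_out = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(dim)]
    return {"w_in": w_in, "w_out": w_out}

def num_params(gen):
    """Count the scalar weights in a toy generator."""
    return sum(len(row) for layer in gen.values() for row in layer)

def run_generator(gen, x, label):
    """Apply the generator to vector x, conditioned on a domain label."""
    inp = x + [float(label)]
    hidden = [max(0.0, sum(w * v for w, v in zip(row, inp))) for row in gen["w_in"]]  # ReLU
    return [sum(w * v for w, v in zip(row, hidden)) for row in gen["w_out"]]

def cycle_l1(x, x_rec):
    """Mean absolute error between an input and its reconstruction."""
    return sum(abs(a - b) for a, b in zip(x, x_rec)) / len(x)

DIM = 8
# Unequal sizes: which generator is larger is an arbitrary choice in this sketch.
g_t = make_generator(hidden=32, dim=DIM, seed=0)  # translation generator
g_r = make_generator(hidden=8, dim=DIM, seed=1)   # reconstruction generator

x = [0.5] * DIM                           # a flattened "source image"
fake = run_generator(g_t, x, label=1)     # source -> target domain
rec = run_generator(g_r, fake, label=0)   # target -> back to source domain
loss = cycle_l1(x, rec)                   # cycle-consistency term

print("translation params:", num_params(g_t))
print("reconstruction params:", num_params(g_r))
print("cycle L1 loss:", loss)
```

In a symmetric design such as the StarGAN setup described above, `g_t` and `g_r` would be the same network; keeping them separate is what allows each one to be sized for its own task.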