Abstract: The objective of multi-domain image-to-image translation is to learn the mapping from a source domain to a target domain in multiple image domains while preserving the content representation of the source domain. Despite the importance and recent efforts, most previous studies disregard the large style discrepancy between images and instances in various domains, or fail to capture instance details and boundaries properly, resulting in poor translation results for rich scenes. To address these problems, we present an effective architecture for multi-domain image-to-image translation that only requires one generator. Specifically, we provide detailed procedures for capturing the features of instances throughout the learning process, as well as learning the relationship between the style of the global image and that of a local instance in the image by enforcing the cross-granularity consistency. In order to capture local details within the content space, we employ a dual contrastive learning strategy that operates at both the instance and patch levels. Extensive studies on different multi-domain image-to-image translation datasets reveal that our proposed method outperforms state-of-the-art approaches.

What problem does this paper attempt to address?

This paper addresses multi-domain image-to-image translation: learning mappings from a source domain to a target domain across multiple image domains while preserving the content representation of the source image. Existing methods often overlook the large style discrepancy between whole images and the instances within them across domains, or fail to capture instance details and boundaries accurately, which leads to poor translations of complex scenes.

To address these issues, the paper proposes an architecture that requires only one generator. The authors describe procedures for capturing instance features throughout the learning process, and for learning the relationship between the style of the global image and the style of a local instance by enforcing cross-granularity consistency. To capture local details within the content space, they adopt a dual contrastive learning strategy that operates at both the instance level and the patch level. Experiments on several multi-domain image-to-image translation datasets show that the approach outperforms existing state-of-the-art methods.

The main contributions of the paper are:
1. A cross-granularity contrastive learning framework for high-quality multi-domain image-to-image translation.
2. Specific steps that incorporate instance features into the learning process and guide the learning of the relationship between instance style and image style by enforcing cross-granularity consistency.
3. Instance-level and patch-level contrastive learning modules that preserve the local details of the original image and its instances.
4. Extensive qualitative and quantitative experiments on standard benchmarks that validate the superiority of the proposed method.

In addition, unlike other methods, the model requires only one generator for instance-aware mapping. This simplifies the model structure and lets instances and global images share certain features, which makes the generated instances easier to integrate into the translated images.
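The summary does not give the paper's loss functions, so the following is only a minimal sketch of what a dual contrastive objective at the patch and instance levels could look like. It assumes a generic InfoNCE loss (as used in PatchNCE-style methods), applied once to patch features and once to instance features and then summed with weights. The function names `info_nce` and `dual_contrastive_loss`, the weights `w_patch` and `w_inst`, and the temperature value are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def info_nce(query, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss (an assumption, not the paper's exact loss).

    query:     (N, D) features from the translated image
    positive:  (N, D) features from the matching location in the source image
    negatives: (N, K, D) features from non-matching locations
    """
    def normalize(x):
        # L2-normalise so dot products become cosine similarities
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q, p, n = normalize(query), normalize(positive), normalize(negatives)
    pos_logit = np.sum(q * p, axis=-1, keepdims=True)   # (N, 1)
    neg_logits = np.einsum("nd,nkd->nk", q, n)          # (N, K)
    logits = np.concatenate([pos_logit, neg_logits], axis=1) / temperature

    # Cross-entropy with the positive at index 0 (stable log-softmax)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())


def dual_contrastive_loss(patch_feats, instance_feats, w_patch=1.0, w_inst=1.0):
    """Weighted sum of a patch-level and an instance-level InfoNCE term.

    Each argument is a (query, positive, negatives) tuple. The weighting
    scheme is hypothetical.
    """
    return w_patch * info_nce(*patch_feats) + w_inst * info_nce(*instance_feats)
```

In this sketch, the patch-level term would receive features sampled at spatial locations of the whole image, while the instance-level term would receive features pooled from instance regions (for example, object bounding boxes). How the paper actually samples and pools these features is not stated in the summary.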

Multi-Domain Image-to-Image Translation with Cross-Granularity Contrastive Learning

Image Cross-Domain Translation Algorithm Based on Self-Similarity and Contrastive Learning

Multi-Curve Translator for Real-Time High-Resolution Image-to-Image Translation

Unsupervised content and style learning for multimodal cross-domain image translation

DMDIT: Diverse multi-domain image-to-image translation

Image-to-Image Translation with Multi-Path Consistency Regularization

Unsupervised Multi-Domain Image Translation with Domain-Specific Encoders/Decoders

Unsupervised Multi-Domain Multimodal Image-to-image Translation with Explicit Domain-Constrained Disentanglement.

Multi-cropping Contrastive Learning and Domain Consistency for Unsupervised Image-to-Image Translation

Cross-domain image translation with a novel style-guided diversity loss design

Building Cross-Domain Mapping Chains from Multi-CycleGAN for Hyperspectral Image Classification

Multi-mapping Image-to-Image Translation Via Learning Disentanglement.

A one-to-many conditional generative adversarial network framework for multiple image-to-image translations

Multi-attention bidirectional contrastive learning method for unpaired image-to-image translation

Contrastive Learning with Attention Mechanism and Multi-Scale Sample Network for Unpaired Image-to-Image Translation

Progressive Energy-Based Cooperative Learning for Multi-Domain Image-to-Image Translation

Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation

Style Image Harmonization Via Global-Local Style Mutual Guided

Learning Site-specific Styles for Multi-institutional Unsupervised Cross-modality Domain Adaptation

Incremental Learning of Multi-Domain Image-to-Image Translations

Accuracy of cast restorations produced by a refractory die-investing technique.