IIT-GAT: Instance-level Image Transformation Via Unsupervised Generative Attention Networks with Disentangled Representations

Mingwen Shao, Youcai Zhang, Yuan Fan, Wangmeng Zuo, Deyu Meng
DOI: https://doi.org/10.1016/j.knosys.2021.107122
IF: 8.139
2021-01-01
Knowledge-Based Systems
Abstract: Image-to-image translation is an important research field in computer vision and is widely associated with Generative Adversarial Networks (GANs) and dual learning. However, existing methods mainly translate the whole image from the source domain to the target domain; they cannot perform instance-level image-to-image translation, and the translation results in the target domain cannot be controlled. In this paper, an instance-level image-to-image translation network (IIT-GAT) is proposed, which includes an attention module and a feature-encoder module. The attention module guides the model to focus on the instances of interest and generates instance masks, which help to separate the instance from the background of an image. The feature-encoder module embeds images into two different spaces: a domain-invariant content space and a domain-specific attribute space. The content features and attribute features of different images are fed to the generator simultaneously to improve the controllability of image-to-image translation. To this end, we introduce a local self-reconstruction loss that encourages the network to learn the style features of target instances. Overall, our method not only improves the quality of instance-level image-to-image translation but also makes the translation more controllable. Extensive experiments are conducted on multiple datasets to validate the effectiveness of the proposed framework, and the results show that our method outperforms previous methods.
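The abstract does not give formulas, so the following is only a minimal sketch of how an instance mask might separate instance from background and how a reconstruction loss might be restricted to the instance region. The blending rule, the L1 form of the loss, and the function names `composite` and `local_l1` are all assumptions for illustration, not the paper's stated method.

```python
import numpy as np

def composite(source, translated, mask):
    """Blend translated instance pixels into the untouched source background.

    Assumed rule (not taken from the paper): mask * translated + (1 - mask) * source.
    Images have shape (H, W, C); mask has shape (H, W, 1) with values in [0, 1].
    """
    return mask * translated + (1.0 - mask) * source

def local_l1(reconstructed, target, mask, eps=1e-8):
    """Mean absolute error computed only over the masked (instance) region.

    An assumed stand-in for a "local" self-reconstruction loss.
    """
    weighted = mask * np.abs(reconstructed - target)
    channels = reconstructed.shape[-1]
    return weighted.sum() / (mask.sum() * channels + eps)

# Toy 2x2 RGB example: only the top-left pixel belongs to the instance.
source = np.zeros((2, 2, 3))
translated = np.ones((2, 2, 3))
mask = np.array([[1.0, 0.0], [0.0, 0.0]])[..., None]

output = composite(source, translated, mask)
loss_inside = local_l1(output, translated, mask)   # instance matches the target
loss_source = local_l1(source, translated, mask)   # instance left unchanged
```

In this toy case the background pixels of `output` stay equal to `source`, and the loss ignores any differences outside the mask, which is the behaviour the word "local" suggests.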