DMDIT: Diverse multi-domain image-to-image translation

Mingwen Shao, Youcai Zhang, Huan Liu, Chao Wang, Le Li, Xun Shao
DOI: https://doi.org/10.1016/j.knosys.2021.107311
2021-10-01
Abstract: Cross-domain image translation, which aims to learn the mapping between two different domains, has made remarkable progress in recent years. A good cross-domain image translation model should meet the following conditions: (1) it does not rely on paired datasets, (2) it can handle multiple domains, and (3) it produces diverse outputs from the same source image. Most state-of-the-art studies address only two of these, i.e., either (1) and (2), or (1) and (3). In this paper, we construct a unified diverse multi-domain image-to-image translation framework (DMDIT) that satisfies all three requirements simultaneously. Unlike traditional approaches, the proposed generator achieves diverse, multi-label image-to-image translation while retaining the underlying features of the input image. The diverse outputs are obtained through latent noise randomly sampled from a normal distribution. To further improve the diversity of the outputs, we propose a novel style regularization loss to constrain the latent noise. Because mode collapse usually occurs when the noise is insufficiently constrained, we embed a noise separation module in the discriminator to avoid this issue. In addition, we apply an attention mechanism so that the model focuses on the most attribute-relevant regions, which helps improve the quality of the generated images. Extensive qualitative and quantitative evaluations demonstrate the effectiveness of our approach.
computer science, artificial intelligence