Cross-modal and Semantics-Augmented Asymmetric CycleGAN for Data-Imbalanced Anime Style Face Translation

Shiping Deng, Kaoru Uchida, Zhengwei Yin
DOI: https://doi.org/10.1145/3503961.3503969
2021-11-19
Abstract: Human face to anime face translation has attracted the attention of many researchers in recent years, and various works have achieved high-quality style transfer on conventional tasks. However, existing works often have serious shortcomings when the target-domain training data is heavily insufficient, a setting we call imbalanced. Here, an imbalanced (low-resource) task is one whose training dataset is much smaller than in the conventional task, e.g. a training data size of less than 100. To solve this problem, we propose a multi-modal translation model for a specific style. Building on the cyclic adversarial network and the class activation map, we introduce a semantic modality to enrich the data information, together with attention modules that help the model focus on the discriminative areas between the source and target domains. The experimental results show that our method outperforms similar existing work in low-resource settings.
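The abstract names two building blocks, the cyclic adversarial network and the class activation map, without giving formulas. The following is a minimal sketch of the generic, textbook forms of both: a class activation map computed as a weighted sum of feature maps, and an L1 cycle-consistency term between an input and its reconstruction. It is not the authors' implementation; the function names `class_activation_map` and `cycle_consistency_l1`, the min-max normalization, and the toy inputs are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def class_activation_map(features: List[Matrix], weights: List[float]) -> Matrix:
    """Generic CAM: weighted sum of feature maps, min-max normalized to [0, 1].

    `features` holds one H x W map per channel; `weights` holds one
    classifier weight per channel (hypothetical inputs for illustration).
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    height, width = len(features[0]), len(features[0][0])
    cam = [
        [sum(w * fm[i][j] for fm, w in zip(features, weights)) for j in range(width)]
        for i in range(height)
    ]
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    span = hi - lo
    if span == 0:
        # A flat map carries no spatial preference; return all zeros.
        return [[0.0] * width for _ in range(height)]
    return [[(v - lo) / span for v in row] for row in cam]


def cycle_consistency_l1(original: Matrix, reconstructed: Matrix) -> float:
    """Mean absolute difference between an input and its round-trip reconstruction."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, reconstructed):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


if __name__ == "__main__":
    feats = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
    print(class_activation_map(feats, [2.0, 1.0]))
    print(cycle_consistency_l1([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 2.0]]))
```

In a CycleGAN-style setup the reconstruction would come from passing an image through both generators in turn; here it is supplied directly so that the sketch stays self-contained.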