A face template: Improving the face generation quality of multi-stage generative adversarial networks using coarse-grained facial priors

Yu Wang, Hongxia Wang
DOI: https://doi.org/10.1007/s11042-023-16183-2
IF: 2.577
2023-07-29
Multimedia Tools and Applications
Abstract: Current text-to-face synthesis models rely only on text descriptions for image synthesis and neglect prior information about basic facial features. As a result, the generators at each stage learn too little about both coarse-grained facial features, such as face shape and the positions of the basic organs, and fine-grained facial features, such as facial wrinkles and hair textures, and the generated faces are of low quality. To address this issue, we propose a generic face template that contains only common facial information: the face contour and the shapes and relative positions of the basic organs. To embed the face template into the first of the three generator stages, we design a Facial Coarse-grained Feature Excitation Module (FCFEM). FCFEM extracts coarse-grained feature channel weights from the face template and uses them to recalibrate the channels of the intermediate feature maps of the first-stage generator. This helps the first stage generate more precise and complete initial face images, which in turn improves the ability of the generators in the latter two stages to learn fine-grained features. Experiments on the Multi-Modal CelebA-HQ dataset demonstrate that multi-stage models using our method generate face images of higher quality and realism than the original models. They also achieve higher semantic consistency between the generated images and the text descriptions.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
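
The abstract describes FCFEM only at a high level: channel weights are extracted from the face template and used to recalibrate the first-stage generator's intermediate feature maps. The sketch below illustrates that idea under the assumption of a squeeze-and-excitation-style design (global average pooling, a small bottleneck, a sigmoid gate). The layer layout, the reduction ratio, the random parameters, and every function name here are hypothetical and are not taken from the paper. It uses plain Python lists so that it runs without any framework.

```python
import math
import random


def global_avg_pool(feats):
    """Squeeze: reduce each channel's H x W map to a single average value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feats]


def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_weights(template_feats, w1, b1, w2, b2):
    """Excitation (assumed design): pooled template features -> bottleneck -> gate in (0, 1)."""
    squeezed = global_avg_pool(template_feats)
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]  # ReLU
    return [sigmoid(z) for z in dense(hidden, w2, b2)]


def recalibrate(gen_feats, weights):
    """Scale every value in channel c of the generator features by weights[c]."""
    return [[[weights[c] * v for v in row] for row in ch]
            for c, ch in enumerate(gen_feats)]


# Toy demo: 4 channels of 2 x 2 maps, bottleneck of 2 units (all values hypothetical).
rng = random.Random(0)
C, H, W, R = 4, 2, 2, 2


def rand_map():
    return [[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)]


template_feats = [rand_map() for _ in range(C)]  # features of the face template
gen_feats = [rand_map() for _ in range(C)]       # first-stage generator features

w1 = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(R)]
b1 = [0.0] * R
w2 = [[rng.uniform(-1, 1) for _ in range(R)] for _ in range(C)]
b2 = [0.0] * C

weights = channel_weights(template_feats, w1, b1, w2, b2)
out = recalibrate(gen_feats, weights)
print("channel weights:", [round(w, 3) for w in weights])
```

In this sketch the gate is computed from the template features alone and applied to the generator features, which is one plausible reading of "extracts the coarse-grained feature channel weights of the face template"; the actual module may differ, for example by using convolutions or by conditioning on both inputs.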