BodyGAN: General-purpose Controllable Neural Human Body Generation
Chaojie Yang, Hanhui Li, Shengjie Wu, Shengkai Zhang, Haonan Yan, Nianhong Jiao, Jie Tang, Runnan Zhou, Xiaodan Liang, Tianxiang Zheng
DOI: https://doi.org/10.1109/cvpr52688.2022.00758
2022-01-01
Abstract: Recent advances in generative adversarial networks (GANs) have provided potential solutions for photo-realistic human image synthesis. However, explicit and individual control of synthesis over multiple factors, such as poses, body shapes, and skin colors, remains difficult for existing methods. This is because current methods mainly rely on a single pose/appearance model, which is limited in disentangling the various poses and appearances in human images. In addition, such a unimodal strategy is prone to causing severe artifacts in the generated images, such as color distortions and unrealistic textures. To tackle these issues, this paper proposes a multi-factor conditioned method dubbed BodyGAN. Specifically, given a source image, our BodyGAN aims at capturing the characteristics of the human body from multiple aspects: (i) A pose encoding branch consisting of three hybrid subnetworks is adopted to generate the semantic-segmentation-based representation, the 3D-surface-based representation, and the keypoint-based representation of the human body, respectively. (ii) Based on the segmentation results, an appearance encoding branch is used to obtain the appearance information of the human body parts. (iii) The outputs of these two branches are represented by user-editable condition maps, which are then processed by a generator to predict the synthesized image. In this way, our BodyGAN can achieve fine-grained disentanglement of pose, body shape, and appearance, and consequently enable explicit and effective control of synthesis with diverse conditions. Extensive experiments on multiple datasets and a comprehensive user study show that our BodyGAN achieves state-of-the-art performance.
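
To make steps (i) to (iii) concrete, the following is a minimal sketch of how the outputs of the two branches might be stacked into a single editable condition tensor before being passed to a generator. It is an illustration under stated assumptions, not the authors' implementation: the function names `appearance_map` and `build_condition_maps`, the channel layout, and the use of a per-part mean color as the appearance encoding are all hypothetical, and the three pose representations are taken as precomputed inputs rather than produced by subnetworks.

```python
import numpy as np


def appearance_map(image, seg, num_parts):
    """Paint every body part with its mean color.

    Hypothetical stand-in for the appearance encoding branch; the abstract
    does not specify how appearance is actually encoded.
    """
    out = np.zeros_like(image, dtype=np.float32)
    for part in range(1, num_parts + 1):  # label 0 is assumed to be background
        mask = seg == part
        if mask.any():
            out[mask] = image[mask].mean(axis=0)
    return out


def build_condition_maps(image, seg, surface, keypoints, num_parts):
    """Stack pose and appearance information into one (H, W, C) tensor.

    image:     (H, W, 3) float source image
    seg:       (H, W) integer part labels in [0, num_parts]
    surface:   (H, W, 3) 3D-surface-based map (assumed layout)
    keypoints: (H, W, K) keypoint heatmaps (assumed layout)
    """
    # One-hot encode the segmentation so each part gets its own channel.
    seg_onehot = np.eye(num_parts + 1, dtype=np.float32)[seg]
    app = appearance_map(image, seg, num_parts)
    return np.concatenate([seg_onehot, surface, keypoints, app], axis=-1)
```

Because each factor occupies its own group of channels in this sketch, a user edit (for example, changing the color of one part in the last three channels) leaves the pose channels untouched, which is the kind of independent control the abstract describes.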