Abstract: In this paper, we propose Occlusion-Aware Warping GAN (OAW-GAN), a unified Human Video Synthesis (HVS) framework that tackles human video motion transfer, attribute editing, and inpainting. To our knowledge, this is the first work that handles all of these tasks with a single model trained once. Although existing GAN-based HVS methods have achieved great success, they either fail to preserve appearance details because spatial consistency between the synthesized target frames and the input source images is lost, or generate incoherent videos because temporal consistency among frames is lost. Moreover, most of them cannot create new content while keeping existing content, and they fail especially when some regions of the target are invisible in the source due to self-occlusion. To address these limitations, we first introduce a Coarse-to-Fine Flow Warping Network (C2F-FWN) that estimates a spatially and temporally consistent transformation between source and target, together with an occlusion mask indicating which parts of the target are invisible in the source. The flow and the mask are then scaled and fed into the pyramidal stages of OAW-GAN, where they guide Occlusion-Aware Synthesis (OAS), which can be abstracted into visible-part re-utilization and invisible-part inpainting at the feature level and effectively alleviates the self-occlusion problem. Extensive experiments on both human video (i.e., iPER, SoloDance) and image (i.e., DeepFashion) datasets demonstrate the superiority of our approach over existing state-of-the-art methods. We also show that, beyond the motion transfer task addressed by previous work, our framework can further perform attribute editing and texture inpainting, which paves the way towards unified HVS.
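
The split into visible-part re-utilization and invisible-part inpainting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`warp_nearest`, `occlusion_aware_blend`), the nearest-neighbour backward warp, the single-channel feature map, and the mask convention (1 = invisible in the source, 0 = visible) are all assumptions made for illustration. In the paper the flow and mask come from C2F-FWN and the blend is applied at several pyramid scales inside the generator; here both are simply given as inputs.

```python
def warp_nearest(feature, flow):
    """Backward-warp a 2D feature map with a per-pixel flow (dx, dy).

    Assumed convention: out[y][x] = feature[y + dy][x + dx], using
    nearest-neighbour sampling and clamping at the borders.
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            sx = min(max(int(round(x + dx)), 0), w - 1)
            sy = min(max(int(round(y + dy)), 0), h - 1)
            out[y][x] = feature[sy][sx]
    return out


def occlusion_aware_blend(source_feature, flow, occlusion_mask, generated_feature):
    """Reuse warped source features where visible, generated ones where occluded.

    occlusion_mask[y][x] is assumed to be 1 where the target location is
    invisible in the source and 0 where it is visible.
    """
    warped = warp_nearest(source_feature, flow)
    h, w = len(warped), len(warped[0])
    return [
        [
            (1.0 - occlusion_mask[y][x]) * warped[y][x]
            + occlusion_mask[y][x] * generated_feature[y][x]
            for x in range(w)
        ]
        for y in range(h)
    ]


# Toy 2x2 example: every target pixel samples the source pixel to its right.
source = [[1, 2], [3, 4]]
flow = [[(1, 0), (1, 0)], [(1, 0), (1, 0)]]
mask = [[0, 1], [0, 0]]        # top-right target pixel is occluded
generated = [[9, 9], [9, 9]]   # stand-in for inpainted features
blended = occlusion_aware_blend(source, flow, mask, generated)
```

In the toy example only the occluded pixel takes its value from `generated`; every other location is copied from the warped source, which is the behaviour the abstract attributes to OAS at the feature level.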

Hierarchy Composition GAN for High-fidelity Image Synthesis

Fine-grained Semantic Constraint in Image Synthesis

OAW-GAN: Occlusion-Aware Warping GAN for Unified Human Video Synthesis

SpatialGAN: Progressive Image Generation Based on Spatial Recursive Adversarial Expansion

Spatial Fusion GAN for Image Synthesis

Compositional GAN: Learning Image-Conditional Binary Composition

Compositional GAN: Learning Conditional Image Composition

Customizable GAN: Customizable Image Synthesis Based on Adversarial Learning

FBC-GAN: Diverse and Flexible Image Synthesis via Foreground-Background Composition

3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis

Composition-Aided Face Photo-Sketch Synthesis

Towards Realistic Face Photo-Sketch Synthesis via Composition-Aided GANs

Dual Attention GANs for Semantic Image Synthesis

MT-GAN: toward realistic image composition based on spatial features

Inducing Hierarchical Compositional Model by Sparsifying Generator Network

SAC-GAN: Structure-Aware Image Composition

Multi-view Consistent Generative Adversarial Networks for Compositional 3D-Aware Image Synthesis

BachGAN: High-Resolution Image Synthesis from Salient Object Layout

Toward Realistic Face Photo–Sketch Synthesis via Composition-Aided GANs

DE-GAN: Domain Embedded GAN for High Quality Face Image Inpainting