Unpaired semantic neural person image synthesis

Yixiu Liu, Tao Jiang, Pengju Si, Shangdong Zhu, Chenggang Yan, Shuai Wang, Haibing Yin
DOI: https://doi.org/10.1007/s00371-024-03331-4
IF: 2.835
2024-04-03
The Visual Computer
Abstract: Pose-guided person image synthesis is a challenging task that aims to generate photo-realistic images of a person with the same appearance as a source image but the same pose as a target image. Existing methods often suffer from noticeable artifacts because they omit multi-view information, and some of them require paired source–target images during training, which further limits their application. To address these issues, we present a semantic neural person image synthesis framework, named SNPIS, which leverages neural radiance fields (NeRF) to synthesize high-fidelity human images of arbitrary poses from multi-view source images and target semantic maps. First, we introduce the semantic mirror orientation adjustment, which forces sampling points to focus on the human body, effectively suppressing background interference and enhancing human details. Then we devise a NeRF-based appearance-shape decoupling generative adversarial network, which separates the appearance and shape of a shared volume generated from multi-view source images and their corresponding semantic maps. Finally, we use the resulting decoupled generator to synthesize human images guided by the target semantic maps, employing appearance inversion and optimizing pose reconstruction with a semantic consistency constraint. Experimental results show that our approach not only outperforms existing unpaired pose-guided person image synthesis methods, but also competes with many paired methods.
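
For readers unfamiliar with NeRF, the rendering step the abstract builds on can be illustrated with the standard volume-rendering quadrature: densities and colors sampled along a camera ray are alpha-composited into one pixel color. The sketch below is a generic illustration of that textbook formula only. It is not the SNPIS implementation, and the function name `composite_ray` and its parameters are hypothetical names chosen for this example.

```python
import numpy as np

def composite_ray(sigmas, colors, t_vals):
    """Alpha-composite samples along one ray (standard NeRF quadrature).

    Illustrative only: a generic sketch, not the SNPIS code.
    sigmas: (N,) volume densities, colors: (N, 3) RGB, t_vals: (N,) depths.
    """
    sigmas = np.asarray(sigmas, dtype=float)
    colors = np.asarray(colors, dtype=float)
    t_vals = np.asarray(t_vals, dtype=float)
    # Distance between adjacent samples; the last interval is treated as very large.
    deltas = np.append(np.diff(t_vals), 1e10)
    # Opacity contributed by each sample.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: fraction of light that reaches sample i unoccluded.
    trans = np.cumprod(np.append(1.0, 1.0 - alphas[:-1]))
    weights = trans * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

In this formulation, samples in empty space receive near-zero weight, so a ray that passes through free space and then hits a dense surface takes its color almost entirely from that surface. Concentrating samples on the human body, as the abstract describes, would spend more of the per-ray sample budget where these weights are non-zero.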
computer science, software engineering