Pose-Normalized and Appearance-Preserved Street-to-Shop Clothing Image Generation and Feature Learning

Huijing Zhan, Chenyu Yi, Boxin Shi, Jie Lin, Ling-Yu Duan, Alex C. Kot
DOI: https://doi.org/10.1109/TMM.2020.2978669
IF: 7.3
2021-01-01
IEEE Transactions on Multimedia
Abstract: We tackle the task of street-to-shop clothing image synthesis. Given an everyday image of a person wearing a particular clothing item, captured in a street scenario, we aim to synthesize the frontal-facing view of that item in a shop scenario. This problem poses the following challenges: 1) the pronounced visual discrepancy between the street and shop scenarios; 2) the severe shape deformation of clothing under arbitrary human poses; 3) the preservation of fine-grained details during clothing image generation. In this paper, we jointly address these challenges by proposing a Pose-Normalized and Appearance-Preserved Generative Adversarial Network (PNAP-GAN). More specifically, conditioned on a clothing-agnostic representation (i.e., clothing landmarks and a semantic parsing map), we disentangle shape synthesis and appearance synthesis in a coarse-to-fine framework. Moreover, a semantic embedding loss is introduced to guide the domain transfer at the semantic level (i.e., keeping the clothing attributes). From the synthesized frontal shop image, we extract a pose-normalized representation that is complementary to the domain-invariant feature learnt from the original street image, and the two are integrated to facilitate street-to-shop clothing retrieval. Extensive experiments demonstrate the effectiveness of the proposed PNAP-GAN in generating high-quality frontal-view images and the superiority of the learnt pose-normalized features over existing methods on the retrieval task. In addition, we demonstrate that the pose-normalized retrieval feature benefits cross-scenario (i.e., street-to-shop) clothing image generation in a semantics-preserving manner.
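The retrieval step described in the abstract, where a pose-normalized feature from the synthesized shop image is integrated with a domain-invariant feature from the street image, can be illustrated with a minimal sketch. The abstract does not say how the two features are combined, so the concatenation of L2-normalized vectors and the cosine-similarity ranking below are assumptions made for illustration, not the authors' method. The function names (`fuse_features`, `rank_gallery`) and the toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    return [x / norm for x in v]


def fuse_features(street_feat: Sequence[float],
                  pose_norm_feat: Sequence[float]) -> List[float]:
    """Assumed fusion rule: normalize each feature, concatenate, renormalize.

    street_feat    -- feature from the original street image (hypothetical input)
    pose_norm_feat -- feature from the synthesized frontal shop image
    """
    fused = l2_normalize(street_feat) + l2_normalize(pose_norm_feat)
    return l2_normalize(fused)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def rank_gallery(query: Sequence[float],
                 gallery: Sequence[Sequence[float]]) -> List[int]:
    """Return gallery indices sorted from most to least similar to the query."""
    scored = [(cosine(query, g), i) for i, g in enumerate(gallery)]
    scored.sort(key=lambda pair: -pair[0])
    return [i for _, i in scored]


# Toy 2-D features standing in for real network embeddings.
query = fuse_features([1.0, 0.0], [0.0, 1.0])
gallery = [
    fuse_features([0.0, 1.0], [1.0, 0.0]),  # disagrees on both parts
    fuse_features([1.0, 0.0], [0.0, 1.0]),  # agrees on both parts
    fuse_features([1.0, 0.0], [1.0, 0.0]),  # agrees on the street part only
]
ranking = rank_gallery(query, gallery)
print(ranking)
```

In this toy setup, a gallery item that matches the query on both feature parts ranks above one that matches on only one part, which is the intuition behind using the two features as complementary cues.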