Abstract: Pose-guided person image generation, which aims to transfer a given person image to a target pose, has recently received considerable research attention. Because pose variations cause spatial misalignment and occlusion of local body parts, the task remains challenging, especially in maintaining high-fidelity textures and body structures in the generated images. In addition, most existing works are constrained by the limited number of texture styles in the available person datasets, which restricts the diversity of generated persons' appearances. To address these problems, we design a Kernel-based Texture-Fusion Joint Refinement Network (TFJR-Net) that jointly refines the structure and texture information of generated images. First, we leverage a bone-map representation to guide the generation of human parsing maps; it carries more structure priors and richer context information than traditional key-point maps, thus reducing the uncertainty of generated body structures. Next, we propose a Texture-Kernel Injection Normalization (TKIN) module that injects each per-region texture kernel into the corresponding semantic region of the human parsing map, which decouples texture from shape information and preserves fine-grained features of complex textures. Furthermore, we are the first to introduce external texture patterns from outside the dataset into human semantic regions such as the upper clothes. We fuse the two texture domains in a shared texture space through our texture-fusion TKIN modules. Extensive experiments are conducted on the DeepFashion dataset, with the DTD dataset as an external texture source. The experimental results demonstrate that our method generates persons with better textures and structures than state-of-the-art works, and show that it generalizes by absorbing diverse external textures when generating person images. The source code is available at https://github.com/pilgrim00/TKIN.
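
The idea of injecting a per-region texture kernel into its matching semantic region can be illustrated with a small NumPy sketch. This is not the authors' TKIN implementation (see the repository for that): it assumes, purely for illustration, that each region's texture kernel is a 1x1 convolution over channels and that the parsing map is given as one-hot masks. The function name `region_kernel_injection` and all parameter names are hypothetical.

```python
import numpy as np


def region_kernel_injection(features, parsing, kernels, eps=1e-5):
    """Illustrative region-wise kernel injection (a simplification, not TKIN itself).

    features: (C, H, W) feature map
    parsing:  (K, H, W) one-hot masks, one per semantic region (assumed format)
    kernels:  (K, C, C) one 1x1 kernel per region (assumed kernel shape)
    """
    # Parameter-free normalization of each channel over the spatial dimensions.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normed = (features - mean) / (std + eps)

    out = np.zeros_like(normed)
    for k in range(parsing.shape[0]):
        # Apply region k's 1x1 kernel: mixes channels independently at every pixel.
        filtered = np.einsum("oc,chw->ohw", kernels[k], normed)
        # Keep the filtered response only inside region k's mask.
        out += parsing[k][None] * filtered
    return out


# Toy example: left half is region 0, right half is region 1.
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4, 4))
parsing = np.zeros((2, 4, 4))
parsing[0, :, :2] = 1.0
parsing[1, :, 2:] = 1.0
# Region 0 keeps its normalized features (identity kernel); region 1 is zeroed out.
kernels = np.stack([np.eye(3), np.zeros((3, 3))])
out = region_kernel_injection(features, parsing, kernels)
```

Because the masks select which kernel acts on which pixels, the shape information (the parsing map) and the texture information (the kernels) enter the computation separately, which is the decoupling the abstract describes.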

Exploring Kernel-based Texture Transfer for Pose-guided Person Image Generation
