Abstract: Human pose image generation is a challenging task in computer vision with applications in animation, gaming, and virtual reality. This paper presents a novel GAN for human pose image generation that combines a U-Net architecture, a Decomposed Component encoder, and a Laplacian image enhancement technique in the generator with Siamese Networks and an alignment loss in the discriminator. The generator uses the U-Net architecture to capture and reconstruct intricate details of human poses. A Decomposed Component encoder learns disentangled representations of pose attributes such as body position, joint angles, and limb orientations, and a Laplacian image enhancement technique improves the visual quality and sharpness of the generated pose images. To ensure that generated pose images align with and resemble real poses, Siamese Networks with Alignment Loss are integrated into the discriminator. The Siamese Networks enable pairwise comparison between generated and real poses, guiding the discriminator to learn features that capture the alignment relationship; the alignment loss is computed from the similarity of the feature representations that the Siamese Networks extract. The proposed model focuses on enhancing the level of detail and realism in the pose representation rather than on improving the resolution or sharpness of the image itself. The U-Net architecture and the image sharpening algorithm help the model generate more realistic images than other state-of-the-art approaches. The manuscript also presents a comparative analysis of experimental results under various physical conditions against existing state-of-the-art approaches. This analysis shows that the proposed algorithm generates images that are 15–20% more realistic than those of other state-of-the-art approaches.
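
As a rough illustration of two components named in the abstract, the sketch below shows Laplacian sharpening of a grayscale image and a similarity-based alignment loss between two feature vectors. It is not the authors' implementation. The 4-neighbour kernel, the `strength` parameter, the edge-replication padding, the cosine form of the loss, and the function names `laplacian_sharpen` and `alignment_loss` are all assumptions made for this example; the abstract only states that a Laplacian enhancement is applied and that the alignment loss is based on feature similarity.

```python
import math

# 4-neighbour Laplacian kernel (assumed; the abstract does not specify a kernel).
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def laplacian_sharpen(img, strength=1.0):
    """Sharpen a 2-D grayscale image given as a list of rows with values in [0, 255].

    Computes img - strength * Laplacian(img), replicating edge pixels at the
    border, and clips the result to [0, 255].
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lap = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # replicate border rows
                    xx = min(max(x + dx, 0), w - 1)  # replicate border columns
                    lap += LAPLACIAN[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = min(255.0, max(0.0, img[y][x] - strength * lap))
    return out


def alignment_loss(feat_generated, feat_real):
    """Return 1 - cosine similarity of two feature vectors.

    The vectors stand in for features that a shared (Siamese) encoder would
    extract from a generated pose image and a real one. Cosine similarity is
    an assumed choice of similarity measure.
    """
    dot = sum(a * b for a, b in zip(feat_generated, feat_real))
    norm_g = math.sqrt(sum(a * a for a in feat_generated))
    norm_r = math.sqrt(sum(b * b for b in feat_real))
    if norm_g == 0.0 or norm_r == 0.0:
        return 1.0  # treat an all-zero feature vector as unaligned
    return 1.0 - dot / (norm_g * norm_r)
```

In this sketch a flat image passes through `laplacian_sharpen` unchanged while an isolated bright pixel is amplified, and `alignment_loss` ranges from 0 for feature vectors pointing in the same direction to 2 for opposite ones. In a real model the features would come from the discriminator's shared-weight branches rather than being supplied by hand.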
