Abstract: Human pose image generation is a challenging task in computer vision with applications in animation, gaming, and virtual reality. This paper presents a novel GAN for human pose image generation that combines a U-Net architecture, a Decomposed Component encoder, and a Laplacian image enhancement technique in the generator with Siamese Networks and an Alignment Loss in the discriminator. The generator network leverages the U-Net architecture to capture and reconstruct intricate details of human poses. A Decomposed Component encoder is incorporated to learn disentangled representations of pose attributes such as body position, joint angles, and limb orientations. A Laplacian image enhancement technique is then applied to improve the visual quality and sharpness of the generated pose images. To ensure that the generated pose images align with and resemble real poses, Siamese Networks with Alignment Loss are integrated into the discriminator. The Siamese Networks enable pairwise comparison between generated and real poses, guiding the discriminator to learn features that capture the alignment relationship. The alignment loss is computed from the similarity of the feature representations extracted by the Siamese Networks. The proposed model focuses on enhancing the level of detail and realism in the pose representation rather than on improving the resolution or sharpness of the image itself. The U-Net architecture and the image sharpening algorithm help the proposed model generate more realistic images than other state-of-the-art approaches. This manuscript also presents a comparative analysis of experimental results under various physical conditions against existing state-of-the-art approaches. The analysis shows that the proposed algorithm generates 15–20% more realistic images than other state-of-the-art approaches.
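As an illustration of two components named in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a 4-neighbour Laplacian kernel for the enhancement step and a mean cosine distance between paired feature vectors for the alignment loss; the abstract specifies neither choice. The names `laplacian_sharpen`, `alignment_loss`, `strength`, and `eps` are hypothetical.

```python
import numpy as np

# 4-neighbour discrete Laplacian kernel (an assumed, common choice).
LAPLACIAN_KERNEL = np.array([[0.0, 1.0, 0.0],
                             [1.0, -4.0, 1.0],
                             [0.0, 1.0, 0.0]])


def laplacian_sharpen(image, strength=1.0):
    """Sharpen a 2-D grayscale image with values in [0, 1].

    The Laplacian response is subtracted from the image, which
    increases contrast across edges. `strength` is a hypothetical knob.
    """
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")  # replicate border pixels
    lap = np.zeros((h, w), dtype=float)
    # Accumulate the 3x3 filter response using shifted views of the image.
    for dy in range(3):
        for dx in range(3):
            lap += LAPLACIAN_KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(image - strength * lap, 0.0, 1.0)


def alignment_loss(feat_generated, feat_real, eps=1e-8):
    """Mean (1 - cosine similarity) over paired feature vectors.

    Both inputs have shape (batch, dim) and stand in for the outputs of
    the two weight-sharing Siamese branches (assumed, not from the paper).
    """
    dot = np.sum(feat_generated * feat_real, axis=1)
    norms = (np.linalg.norm(feat_generated, axis=1)
             * np.linalg.norm(feat_real, axis=1) + eps)
    return float(np.mean(1.0 - dot / norms))
```

Under these assumptions, a flat image passes through `laplacian_sharpen` unchanged because its Laplacian is zero, and `alignment_loss` approaches zero when the generated and real features point in the same direction.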

[Figure: Pose generator (G). Components: head, right arm, left arm, chest, right leg, left leg. Output: plausible pose.]
