Pose image generation for video content creation using controlled human pose image generation GAN
Lalit Kumar, Dushyant Kumar Singh
DOI: https://doi.org/10.1007/s11042-023-17856-8
IF: 2.577
2023-12-22
Multimedia Tools and Applications
Abstract: Human pose image generation is a challenging task in computer vision with applications in animation, gaming, and virtual reality. This paper presents a novel GAN for human pose image generation that combines a U-Net architecture, a Decomposed Component encoder, and a Laplacian image enhancement technique in the generator with Siamese Networks and an Alignment Loss in the discriminator. The generator network leverages the U-Net architecture to capture and reconstruct intricate details of human poses. Additionally, a Decomposed Component encoder is incorporated to learn disentangled representations of pose attributes, such as body position, joint angles, and limb orientations. Furthermore, a Laplacian image enhancement technique is applied to enhance the visual quality and sharpness of the generated pose images. To ensure the alignment and similarity of the generated pose images with real poses, Siamese Networks with Alignment Loss are integrated into the discriminator. The Siamese Networks enable pairwise comparison between generated and real poses, guiding the discriminator to learn features that capture the alignment relationship. The alignment loss is computed based on the similarity of feature representations extracted by the Siamese Networks. The proposed model focuses on enhancing the level of detail and realism in the pose representation rather than improving the resolution or sharpness of the image itself; the U-Net architecture and the image sharpening algorithm help it generate more realistic images than other state-of-the-art approaches. This manuscript also presents a comparative analysis of experimental results under various physical conditions against existing state-of-the-art approaches. This analysis shows that the proposed algorithm generates 15–20% more realistic images than other state-of-the-art approaches.
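The abstract names two components without giving their formulas: Laplacian image enhancement and an alignment loss based on feature similarity. The following is a minimal NumPy sketch of one common reading of each, not the authors' implementation. The 4-neighbour kernel, the `strength` parameter, the use of cosine similarity, and the function names `laplacian_sharpen` and `alignment_loss` are all assumptions made for illustration.

```python
import numpy as np

# 4-neighbour discrete Laplacian kernel (assumed; the paper's exact filter is not stated).
LAPLACIAN_KERNEL = np.array([[0.0, 1.0, 0.0],
                             [1.0, -4.0, 1.0],
                             [0.0, 1.0, 0.0]])


def laplacian_sharpen(image, strength=1.0):
    """Sharpen a 2-D grayscale image with values in [0, 1].

    The Laplacian responds to intensity changes, so subtracting it
    from the image exaggerates edges. `strength` is a hypothetical knob.
    """
    image = np.asarray(image, dtype=np.float64)
    height, width = image.shape
    # Replicate border pixels so flat regions stay flat at the edges.
    padded = np.pad(image, 1, mode="edge")
    lap = np.zeros_like(image)
    # Accumulate the 3x3 filter response using shifted views of the image.
    for dy in range(3):
        for dx in range(3):
            lap += LAPLACIAN_KERNEL[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return np.clip(image - strength * lap, 0.0, 1.0)


def alignment_loss(feat_generated, feat_real, eps=1e-8):
    """Mean (1 - cosine similarity) over paired feature vectors.

    Both inputs have shape (batch, dim) and stand in for the outputs of
    the two weight-sharing Siamese branches. Cosine similarity is an
    assumed choice; the abstract only says "similarity".
    """
    g = np.asarray(feat_generated, dtype=np.float64)
    r = np.asarray(feat_real, dtype=np.float64)
    dot = np.sum(g * r, axis=1)
    norms = np.linalg.norm(g, axis=1) * np.linalg.norm(r, axis=1)
    return float(np.mean(1.0 - dot / (norms + eps)))
```

Under these assumptions the loss is near 0 when the generated and real features point in the same direction and grows toward 2 as they diverge, so minimizing it pulls the two embeddings together.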
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering