Pose Guided Global and Local GAN for Appearance Preserving Human Video Prediction

Jilin Tang, Haoji Hu, Qiang Zhou, Hangguan Shan, Chuan Tian, Tony Q. S. Quek
DOI: https://doi.org/10.1109/icip.2019.8803792
2019-01-01
Abstract: We propose a pose-guided approach to appearance-preserving video prediction that combines global and local information using Generative Adversarial Networks (GANs). The aim is to predict subsequent frames from the previous frames of human action videos. Because such videos contain both background scenes, which are relatively time-invariant across frames, and human actions, which are time-varying, we use a global GAN to model the time-invariant background and coarse human profiles. A local GAN then refines the time-varying human parts. Finally, a 3D auto-encoder fine-tunes the frame-by-frame images to produce the full predicted video. We evaluate our model on the Penn Action and J-HMDB datasets and show that it outperforms other state-of-the-art methods.