APB2FACE: Audio-Guided Face Reenactment with Auxiliary Pose and Blink Signals.

Jiangning Zhang, Liang Liu, Zhucun Xue, Yong Liu
DOI: https://doi.org/10.1109/icassp40776.2020.9052977
2020-01-01
Abstract: Audio-guided face reenactment aims to generate photorealistic faces from audio information while maintaining the same facial movement as when speaking to a real person. However, existing methods either cannot generate vivid face images or can only reenact low-resolution faces, which limits their application value. To solve these problems, we propose a novel deep neural network named APB2Face, which consists of GeometryPredictor and FaceReenactor modules. GeometryPredictor uses extra head pose and blink state signals as well as audio to predict the latent landmark geometry information, while FaceReenactor takes the face landmark image as input to reenact the photorealistic face. A new dataset, AnnVI, collected from YouTube, is presented to support the approach, and experimental results indicate the superiority of our method over state-of-the-art approaches in both authenticity and controllability.
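The two-stage data flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the learned GeometryPredictor is replaced by a fixed random linear map, the FaceReenactor generator is omitted entirely, and the landmark count (68), the audio feature size, the three-angle pose encoding, and the scalar blink value are all assumptions made only for illustration. The sketch shows how audio, head pose, and blink state could be combined into landmark geometry, which is then drawn as a landmark image for a generator to consume.

```python
import random

NUM_LANDMARKS = 68  # assumption: a common landmark count, not stated in the abstract


class GeometryPredictor:
    """Toy stand-in: maps audio + head pose + blink state to 2D landmarks."""

    def __init__(self, audio_dim, seed=0):
        rng = random.Random(seed)
        # Input = audio features + 3 pose angles (assumed) + 1 blink value (assumed).
        in_dim = audio_dim + 3 + 1
        # A fixed random linear layer replaces the learned network.
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
            for _ in range(NUM_LANDMARKS * 2)
        ]

    def predict(self, audio, pose, blink):
        x = list(audio) + list(pose) + [blink]
        flat = [sum(w * v for w, v in zip(row, x)) for row in self.weights]
        # Pair consecutive outputs as (x, y) landmark coordinates.
        return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


def rasterize_landmarks(landmarks, size=64):
    """Draw landmarks onto a size x size binary image (the generator's input)."""
    xs = [p[0] for p in landmarks]
    ys = [p[1] for p in landmarks]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0
    span_y = (max(ys) - min_y) or 1.0
    image = [[0] * size for _ in range(size)]
    for x, y in landmarks:
        col = int((x - min_x) / span_x * (size - 1))
        row = int((y - min_y) / span_y * (size - 1))
        image[row][col] = 1
    return image


def reenact(audio, pose, blink, predictor):
    """Stage 1 predicts geometry; stage 2 (the image generator) is omitted."""
    landmarks = predictor.predict(audio, pose, blink)
    landmark_image = rasterize_landmarks(landmarks)
    return landmarks, landmark_image
```

Because pose and blink enter as explicit inputs alongside audio, changing either one changes the predicted landmarks independently of the speech signal, which is the kind of controllability the abstract refers to.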