Realistic talking face animation with speech-induced head motion

Sandika Biswas, Sanjana Sinha, Dipanjan Das, Brojeshwar Bhowmick
DOI: https://doi.org/10.1145/3490035.3490305
2021-12-19
Abstract: Recent advances in talking face generation from speech have mostly focused on lip synchronization and realistic facial movements such as eye blinks and eyebrow motions, but they do not generate meaningful head motions that follow the speech. This results in a lack of realism, especially for long speech. Very few recent methods try to animate head motions, and those mostly rely on a short driving head motion video. In general, head motion prediction depends largely on the prosodic information of the speech in the current time window. In this paper, we propose a method for generating speech-driven, realistic talking face animation with speech-coherent head motions, accurate lip sync, natural eye blinks, and high-fidelity texture. In particular, we propose an attention-based GAN that identifies the audio highly correlated with the speaker's head motion and learns the relationship between the prosodic information of the speech and the corresponding head motions. Experimental results show that our animations are significantly better than those of state-of-the-art methods, both qualitatively and quantitatively, in terms of output video quality, realism of head movements, lip sync, and eye blinks. Moreover, our user study shows that our speech-coherent head motions make the animation more appealing to users.