Data Mining and Speech Driven Face Animation

Chen Yiqiang (陈益强), Gao Wen (高文), Wang Zhaoqi (王兆其), Jiang Dalong (姜大龙), Zuo Li (左力)
DOI: https://doi.org/10.3969/j.issn.1004-731X.2002.04.025
2002-01-01
Abstract: In this paper, we present a data-mining framework for audio-visual interaction. Several methods, including neural networks, unsupervised clustering, and statistical methods, are used to learn synchronous patterns for speech-driven face animation from a large recorded audio-visual database. The learned patterns are then applied to new audio to produce realistic whole-face motion, including lip-syncing and upper-face expression, with correct dynamics and co-articulation. The proposed method not only automatically incorporates vocal and facial dynamics such as co-articulation, but is also easy to train and more robust, extensible, and interpretable. The performance of our system shows that the proposed learning algorithm is suitable for this task and greatly improves the realism of face animation during speech.