Bi-level Codebook Based Speech-driven Visual-speech Synthesis System

Xi-bin JIA, Bao-cai YIN, Yan-fen SUN
DOI: https://doi.org/10.3969/j.issn.1002-137X.2014.01.018
2014-01-01
Computer Science
Abstract: This paper proposes a speech-driven visual-speech synthesis system based on a bi-level codebook. The system uses vector quantization to establish a coarse-coupling mapping from the speech feature space to the visual-speech feature space. To strengthen the relationship between speech and visual speech, the system performs unsupervised clustering of the sample data according to similarity in both the acoustic speech and the visual speech, and constructs a bi-level mapping codebook that reflects both kinds of similarity. In the preprocessing stage, the paper proposes a joint feature model that reflects both geometric characteristics and the visibility of the teeth. The paper also proposes a genetic-algorithm-based approach for extracting the speech features that correlate with visual speech from LPCC and MFCC speech features. Comparison of the synthesized image sequences with the original ones shows that the synthesized sequences approximate the originals well. In future research, constraints between visual-speech contexts should be considered to improve the smoothness of the synthesis results.
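The codebook mapping described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the two levels are organized, so the sketch assumes that the first level clusters frames by acoustic features and the second level clusters the frames of each acoustic cluster by visual features. The function names (`kmeans`, `build_codebook`, `synthesize`), the use of plain k-means, and the toy data are all hypothetical choices made for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
            for p in points]


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic init (evenly spaced samples)."""
    k = min(k, len(points))
    step = len(points) // k
    centroids = [list(points[i * step]) for i in range(k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean(members)
    return centroids, assign(points, centroids)


def build_codebook(audio, visual, k_audio=2, k_visual=2):
    """Assumed bi-level layout: acoustic clusters, each split by visual similarity.

    Returns a list of (acoustic_centroid, entries); each entry pairs the mean
    acoustic vector of a sub-cluster with its mean visual vector.
    """
    top_centroids, top_labels = kmeans(audio, k_audio)
    codebook = []
    for c, top in enumerate(top_centroids):
        idx = [i for i, l in enumerate(top_labels) if l == c]
        if not idx:
            continue
        _, vis_labels = kmeans([visual[i] for i in idx], k_visual)
        entries = []
        for v in sorted(set(vis_labels)):
            members = [i for i, l in zip(idx, vis_labels) if l == v]
            entries.append((mean([audio[i] for i in members]),
                            mean([visual[i] for i in members])))
        codebook.append((top, entries))
    return codebook


def synthesize(codebook, audio_frame):
    """Map one acoustic frame to a visual vector via a two-step nearest search."""
    _, entries = min(codebook, key=lambda e: dist2(audio_frame, e[0]))
    _, visual_vec = min(entries, key=lambda e: dist2(audio_frame, e[0]))
    return visual_vec


# Toy data (hypothetical): 2-D acoustic features, 1-D visual feature.
audio = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0],
         [5.0, 5.0], [5.1, 5.0], [6.0, 6.0], [6.1, 6.0]]
visual = [[0.0], [0.0], [1.0], [1.0], [5.0], [5.0], [6.0], [6.0]]

codebook = build_codebook(audio, visual)
print(synthesize(codebook, [0.9, 1.1]))
```

Because each frame is mapped independently, a lookup of this kind can produce jumps between consecutive output frames, which is consistent with the abstract's remark that context constraints are needed to improve smoothness.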