Minimum Generation Error Training for HMM-based Prediction of Articulatory Movements

Tian-Yi Zhao, Zhen-Hua Ling, Ming Lei, Li-Rong Dai, Qingfeng Liu
DOI: https://doi.org/10.1109/iscslp.2010.5684840
2010-01-01
Abstract: This paper presents a minimum generation error (MGE) training method for hidden Markov model (HMM) based prediction of articulatory movements when both text and audio inputs are given. In this method, the MGE criterion replaces the maximum likelihood (ML) criterion for estimating the parameters of the unified acoustic-articulatory HMMs. Unlike MGE training for HMM-based acoustic speech synthesis, the generation error used here is defined as the distance between the generated and natural articulatory features. Experimental results show that the proposed method significantly improves the accuracy of articulatory movement prediction: the average root mean square (RMS) error on the test set decreases from 1.002 mm to 0.913 mm.
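As a rough illustration of the evaluation metric, the sketch below computes an RMS error between a generated and a natural trajectory and the relative reduction implied by the reported figures. The abstract does not specify how the error is averaged across articulatory channels or utterances, so the function names, the one-dimensional trajectories, and the toy values are assumptions for illustration only, not the authors' procedure.

```python
import math


def rms_error(generated, natural):
    """RMS of per-frame differences between two equal-length trajectories.

    Assumption: each trajectory is a flat list of positions in mm for a
    single articulatory channel; the paper's exact averaging is not stated.
    """
    if len(generated) != len(natural):
        raise ValueError("trajectories must have the same number of frames")
    if not generated:
        raise ValueError("trajectories must be non-empty")
    squared = [(g - n) ** 2 for g, n in zip(generated, natural)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, improved):
    """Fraction by which the error dropped relative to the baseline."""
    return (baseline - improved) / baseline


# Hypothetical toy trajectories (mm); every frame is off by 0.5 mm.
natural = [0.0, 1.0, 2.0, 3.0]
generated = [0.5, 1.5, 1.5, 2.5]
toy_rms = rms_error(generated, natural)

# RMS errors reported in the abstract: ML baseline vs. MGE training.
reduction = relative_reduction(1.002, 0.913)
print(f"toy RMS error: {toy_rms:.3f} mm")
print(f"relative RMS reduction: {reduction:.1%}")
```

On the reported numbers, the drop from 1.002 mm to 0.913 mm is a relative reduction of roughly 8.9%.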