Evaluation of parameter generation using high order dynamic features and long span windows for HMM based speech synthesis

Yang Wang, Jianhua Tao
DOI: https://doi.org/10.1109/ISCSLP.2014.6936663
2014-01-01
Abstract: The essence of speech parameter generation from HMMs using dynamic features is to take full advantage of the equality constraints between static and dynamic features, which suppress the stepwise parameter sequence formed by consecutive mean vectors and force the generated sequence to be smooth. These constraints have been shown to be useful; however, their number and concrete values have seldom been investigated systematically and thoroughly. This paper experimentally evaluates many possible forms of high-order dynamic features and long-span windows. Objective and subjective experiments show that adding third-order dynamics to the conventional configuration improves performance for the evaluated male speaker. Moreover, using more high-order dynamics reduces the voiced/unvoiced decision error rate, while using only first-order dynamics minimizes the reconstruction error on both spectrum and fundamental frequency.
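The smoothing effect described in the abstract can be illustrated with a minimal one-dimensional sketch of maximum-likelihood parameter generation. This is not the authors' implementation. The window coefficients, the unit variances, the edge handling (replicating the first and last frames), and the names `build_window_matrix` and `generate_trajectory` are all assumptions made for illustration. The sketch stacks the static and dynamic windows into a matrix W and solves (WᵀPW)c = WᵀPμ for the static trajectory c, where μ holds the per-frame means and P the diagonal precisions.

```python
import numpy as np

def build_window_matrix(num_frames, windows):
    """Build W so that W @ c stacks, frame by frame, the static and
    dynamic features computed from the static sequence c.
    Edge frames are replicated (an assumption of this sketch)."""
    T, K = num_frames, len(windows)
    W = np.zeros((K * T, T))
    for t in range(T):
        for k, win in enumerate(windows):
            half = len(win) // 2  # windows are assumed to have odd length
            for j, coef in enumerate(win):
                tau = min(max(t + j - half, 0), T - 1)
                W[K * t + k, tau] += coef
    return W

def generate_trajectory(means, variances, windows):
    """Solve (W^T P W) c = W^T P mu for the static trajectory c.
    means, variances: arrays of shape (T, K), one column per window."""
    T, K = means.shape
    W = build_window_matrix(T, windows)
    mu = means.reshape(-1)                  # frame-interleaved, matches W rows
    precision = 1.0 / variances.reshape(-1)
    A = W.T @ (precision[:, None] * W)
    b = W.T @ (precision * mu)
    return np.linalg.solve(A, b)

# Hypothetical window set: static, first-order, and second-order dynamics.
STATIC = [1.0]
DELTA = [-0.5, 0.0, 0.5]
DELTA_DELTA = [1.0, -2.0, 1.0]
windows = [STATIC, DELTA, DELTA_DELTA]

# Two "states" with static means 0 and 1, giving a stepwise mean sequence;
# all dynamic means are 0 and all variances are 1 (illustrative values only).
T = 20
means = np.zeros((T, len(windows)))
means[T // 2:, 0] = 1.0
variances = np.ones_like(means)

trajectory = generate_trajectory(means, variances, windows)
print(np.round(trajectory, 3))
```

Under these assumptions, a higher-order or longer-span configuration would be expressed by appending further odd-length windows to `windows` and adding the matching columns to `means` and `variances`. The specific coefficients evaluated in the paper are not given in the abstract and are not reproduced here.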