Preserve ordering property of generated LSPs for minimum generation error training in HMM-based speech synthesis

Ming Lei, Zhen-Hua Ling, Li-Rong Dai
DOI: https://doi.org/10.1109/ICASSP.2011.5947407
2011-01-01
ICASSP
Abstract: The ordering property is an important property of line spectral pairs (LSPs) and is closely connected with the naturalness of reconstructed speech. When LSPs are adopted as the spectral feature in HMM-based parametric speech synthesis, the ordering property cannot be guaranteed, because conventional systems use diagonal covariance matrices and therefore ignore the cross-dimension correlation of the LSP vector. This can cause instability in the synthesized speech. In this paper, we propose several methods to preserve the ordering property of generated LSPs in minimum generation error (MGE) training by introducing mis-ordering-related distance measures into the model training criterion. Experimental results show that two methods alleviate mis-orderings significantly without degrading MGE performance, and that one of them, the minimum mis-ordering counting method, requires no acoustic observations for model optimization.
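To make the ordering property concrete, the following is a minimal sketch of how mis-orderings in a single generated LSP vector might be counted. It assumes LSPs are expressed as angular frequencies in radians, so a valid vector satisfies 0 < w_1 < w_2 < ... < w_p < pi. The function name `count_misorderings` and the example values are hypothetical, and this is only an illustration of the property, not the training criterion proposed in the paper, which the abstract does not specify.

```python
import math
from typing import Sequence


def count_misorderings(lsp: Sequence[float]) -> int:
    """Count adjacent pairs that violate 0 < w_1 < ... < w_p < pi.

    Assumption: LSPs are angular frequencies in radians.
    """
    # Pad with the boundary values so the first and last
    # coefficients are checked against 0 and pi as well.
    bounded = [0.0, *lsp, math.pi]
    return sum(1 for lo, hi in zip(bounded, bounded[1:]) if lo >= hi)


# Illustrative values only, not taken from the paper.
ordered = [0.3, 0.7, 1.2, 2.0, 2.8]
swapped = [0.3, 1.2, 0.7, 2.0, 2.8]  # second and third coefficients cross

print(count_misorderings(ordered))
print(count_misorderings(swapped))
```

A count of this kind depends only on the generated parameters themselves, which is consistent with the abstract's remark that the counting-based method needs no acoustic observations.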