Improving the Performance of HMM-based Voice Conversion Using Context Clustering Decision Tree and Appropriate Regression Matrix Format.

Long Qin, Yi-jian Wu, Zhen-Hua Ling, Ren-Hua Wang
DOI: https://doi.org/10.21437/interspeech.2006-578
2006-01-01
Abstract: To improve the performance of an HMM-based voice conversion system that uses LSP coefficients as the spectral representation, this paper adopts a model clustering technique that ties HMMs into classes for model adaptation, taking the phonetic and linguistic contextual factors of the HMMs into account. In addition, because LSP coefficients of adjacent orders are related, a regression matrix format suited to the small amount of adaptation training data is proposed. Subjective and objective tests show that the proposed method adapts the source HMMs more accurately, and that speech synthesized from the adapted model has better discrimination and speech quality.
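The abstract does not say which regression matrix format the authors chose. As an illustration only, the sketch below assumes an affine mean transform of the kind used in linear-regression model adaptation, with a band-diagonal matrix as one plausible way to couple only LSP coefficients of adjacent orders. The function names, the band width, and the numeric values are all hypothetical and are not taken from the paper.

```python
def band_mask(dim, half_width):
    """Return a dim x dim 0/1 mask keeping entries with |i - j| <= half_width."""
    return [[1 if abs(i - j) <= half_width else 0 for j in range(dim)]
            for i in range(dim)]


def count_free_parameters(mask):
    """Count the matrix entries left free by the mask, plus one bias per row."""
    return sum(sum(row) for row in mask) + len(mask)


def adapt_mean(mean, matrix, bias, mask):
    """Apply the masked affine transform: (mask * matrix) @ mean + bias."""
    dim = len(mean)
    return [
        sum(mask[i][j] * matrix[i][j] * mean[j] for j in range(dim)) + bias[i]
        for i in range(dim)
    ]


# Hypothetical 4th-order LSP mean vector (made-up values).
lsp_mean = [0.10, 0.25, 0.40, 0.55]
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
zero_bias = [0.0] * 4

tri_diagonal = band_mask(4, 1)   # couples each order with its neighbours only
full = band_mask(4, 3)           # every order coupled with every other

print(count_free_parameters(tri_diagonal), count_free_parameters(full))
print(adapt_mean(lsp_mean, identity, zero_bias, tri_diagonal))
```

Under these assumptions, restricting the matrix to a band reduces the number of parameters to estimate, which matters when adaptation data is scarce. The gap widens as the LSP order increases, since a full matrix grows quadratically with the order and a fixed-width band grows linearly.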