A Full Training Framework of Cross-Stream Dependence Modelling for HMM-based Singing Voice Synthesis

Xin Wang, Minghui Dong, Zhen-Hua Ling
DOI: https://doi.org/10.1109/icassp.2016.7472662
2016-01-01
Abstract: A cross-stream dependence modelling (CSDM) method has been proposed to model the dependence of spectral distributions on F0 observations for hidden Markov model (HMM) based speech synthesis. However, this method incorporates CSDM only for the embedded training of HMM estimation while ignoring CSDM in the clustering of context-dependent HMMs. This paper applies CSDM to HMM-based singing voice synthesis and presents a decision-tree-based model clustering method with explicit CSDM. This method, in conjunction with the previous CSDM method, forms a full CSDM training framework. Experimental results demonstrate that this full CSDM training framework achieves better performance than the previous CSDM method and the baseline without CSDM in a singing voice synthesis task.
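
The idea of making a spectral distribution depend on F0 can be illustrated with a toy example. The sketch below is not the paper's formulation, which the abstract does not give. It assumes a one-dimensional spectral feature whose Gaussian mean is a linear function of log-F0, and compares its log-likelihood with that of a Gaussian that ignores F0. All function names and data values are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def loglik_independent(spec):
    """Total log-likelihood of spec under a Gaussian that ignores F0."""
    n = len(spec)
    mean = sum(spec) / n
    var = sum((s - mean) ** 2 for s in spec) / n
    return sum(gaussian_logpdf(s, mean, var) for s in spec)


def loglik_dependent(spec, f0):
    """Total log-likelihood when the mean is a * f0 + b (assumed linear form)."""
    n = len(spec)
    mean_f0 = sum(f0) / n
    mean_spec = sum(spec) / n
    cov = sum((f - mean_f0) * (s - mean_spec) for f, s in zip(f0, spec)) / n
    var_f0 = sum((f - mean_f0) ** 2 for f in f0) / n
    a = cov / var_f0                      # least-squares slope
    b = mean_spec - a * mean_f0           # least-squares intercept
    resid_var = sum((s - (a * f + b)) ** 2 for f, s in zip(f0, spec)) / n
    return sum(gaussian_logpdf(s, a * f + b, resid_var) for f, s in zip(f0, spec))


# Hypothetical toy data: a spectral feature that rises with log-F0.
log_f0 = [5.0, 5.2, 5.4, 5.6, 5.8, 6.0]
spec = [0.10, 0.22, 0.27, 0.41, 0.48, 0.62]

gain = loglik_dependent(spec, log_f0) - loglik_independent(spec)
print(f"log-likelihood gain from modelling the F0 dependence: {gain:.3f}")
```

Decision-tree clustering of context-dependent HMMs commonly chooses splits by a likelihood-based criterion, so evaluating that criterion under a dependent model of this kind, rather than an independent one, is one plausible reading of "clustering with explicit CSDM".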