GMM-based Voice Conversion with Explicit Modelling on Feature Transform

Ling-Hui Chen, Zhen-Hua Ling, Wu Guo, Li-Rong Dai
DOI: https://doi.org/10.1109/iscslp.2010.5684869
2010-01-01
Abstract: In this paper, we propose a Gaussian mixture model (GMM) based voice conversion method that uses explicit feature transform models. A piecewise linear transform with a stochastic bias is adopted to represent the relationship between the spectral features of the source and target speakers. These explicit transformations are integrated into the training of the GMM for the joint probability density of source and target features. The maximum likelihood parameter generation algorithm with dynamic features is used to generate the converted spectral trajectories. Our method can model the cross-dimension correlations for the joint density GMM (JDGMM), while significantly reducing computational cost compared with a JDGMM with full covariance matrices. Experimental results show that the proposed method outperforms the conventional GMM-based method in cross-gender voice conversion.
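To make the idea of a piecewise linear transform concrete, here is a minimal sketch of frame-wise conversion in which each mixture component m has its own linear map and a bias with a known mean, and the components are blended by their posterior probabilities given the source frame. This is an illustration under stated assumptions, not the formulation used in the paper: the abstract does not give the equations, so the diagonal-covariance source GMM, the use of the bias mean as the expected bias, and all names (`posteriors`, `convert_frame`, `A`, `bias_means`) are hypothetical. Dynamic features and maximum likelihood parameter generation are left out.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def posteriors(x, weights, means, variances):
    """P(m | x) for each component m of a diagonal GMM over source features."""
    log_p = np.array([
        np.log(w) + log_gauss_diag(x, mu, var)
        for w, mu, var in zip(weights, means, variances)
    ])
    log_p -= log_p.max()  # stabilise before exponentiating
    p = np.exp(log_p)
    return p / p.sum()

def convert_frame(x, weights, means, variances, A, bias_means):
    """Expected target frame: sum_m P(m | x) * (A[m] @ x + bias_means[m]).

    A has shape (M, D, D) and bias_means has shape (M, D). Using the bias
    mean as the expected value of a stochastic bias is an assumption here.
    """
    post = posteriors(x, weights, means, variances)
    per_component = np.einsum("mij,j->mi", A, x) + bias_means
    return post @ per_component

# Toy example: two well-separated components in 2-D (all values made up).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [100.0, 100.0]])
variances = np.ones((2, 2))
A = np.stack([2.0 * np.eye(2), -1.0 * np.eye(2)])
bias_means = np.array([[1.0, 1.0], [5.0, 5.0]])

x = np.array([0.5, -0.5])
y = convert_frame(x, weights, means, variances, A, bias_means)
```

Because the source frame lies close to the first component, its transform dominates and the result is essentially `A[0] @ x + bias_means[0]`. Frames that fall between components receive a soft blend of the per-component transforms, which is what makes the overall mapping piecewise linear.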