Restricted Boltzmann Machine Based Spectrum Modeling and Unit Selection Speech Synthesis Method

Yang SONG, Zhen-Hua LING, Li-Rong DAI
DOI: https://doi.org/10.16451/j.cnki.issn1003-6059.201508001
2015-01-01
Abstract: A spectrum modeling and unit selection speech synthesis method based on the restricted Boltzmann machine (RBM) is proposed. At the model training stage, the RBM is used to model spectral features with rich detail, such as spectral envelopes and short-time spectral amplitudes, replacing the traditional spectral model, which applies a single Gaussian with diagonal covariance to mel-cepstral features. This improves the acoustic model's ability to describe spectral features. At the speech synthesis stage, the RBM is used to calculate the log-likelihoods of the spectral features of candidate samples, and a piecewise linear mapping is proposed to construct the target cost function for unit selection. The experimental results indicate that the proposed method effectively improves the naturalness of synthetic speech.
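The two synthesis-stage steps named in the abstract, scoring a candidate's spectral feature with the RBM and mapping that score to a target cost, can be illustrated with a minimal sketch. The abstract gives no model details, so everything specific here is an assumption rather than the authors' formulation: a Gaussian-Bernoulli RBM with unit visible variances, the negative free energy used as an unnormalized log-likelihood (the partition function is constant across candidates), and hypothetical knot values for the piecewise linear mapping. The function names are illustrative only.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def unnormalized_log_likelihood(v, W, a, b):
    """Negative free energy of a Gaussian-Bernoulli RBM (unit variances assumed).

    v: visible vector (e.g. one frame of a spectral envelope)
    W: weights indexed as W[visible][hidden]
    a: visible biases, b: hidden biases
    """
    quad = 0.5 * sum((vi - ai) ** 2 for vi, ai in zip(v, a))
    hidden = sum(
        softplus(bj + sum(v[i] * W[i][j] for i in range(len(v))))
        for j, bj in enumerate(b)
    )
    return hidden - quad


def piecewise_linear_cost(score, knots):
    """Map a score to a cost by linear interpolation between knots.

    knots: list of (score, cost) pairs with strictly increasing scores
    (hypothetical values). Scores outside the knot range are clamped.
    """
    if score <= knots[0][0]:
        return knots[0][1]
    if score >= knots[-1][0]:
        return knots[-1][1]
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= score <= x1:
            return y0 + (y1 - y0) * (score - x0) / (x1 - x0)


# Toy usage: a 2-dimensional "spectrum" and 2 hidden units.
W = [[0.5, -0.2], [0.1, 0.3]]
a = [0.0, 0.0]
b = [0.0, 0.0]
knots = [(-10.0, 1.0), (0.0, 0.2), (5.0, 0.0)]  # higher likelihood, lower cost
score = unnormalized_log_likelihood([0.4, -0.1], W, a, b)
cost = piecewise_linear_cost(score, knots)
```

In this sketch a higher likelihood yields a lower target cost, and the clamping keeps outlying scores from dominating the unit selection search; the actual mapping and its breakpoints in the paper may differ.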