Multi-speaker Prosodic Instance Selection for HMM-based Speech Synthesis

Yansuo Yu, Fengyun Zhu, Xihong Wu
DOI: https://doi.org/10.1109/chinasip.2013.6625315
2013-01-01
Abstract: In this paper, we propose a novel parametric speech synthesis method based on prosodic instance selection to improve the naturalness of synthesized speech, especially when the database is small. Prosodic instances, including F0 and duration, are selected directly from the database to preserve rich prosodic variations, rather than being generated from statistical models. Because spectral and prosodic parameters can be modeled separately, prosodic instances from multiple speakers, which are easier to obtain than those from a single speaker, are exploited not only to enhance the prosodic models but also to enrich the coverage of prosodic contexts for the synthesized speaker. Subjective listening tests on the corresponding databases show that the proposed method can achieve better performance than both parametric synthesis and waveform concatenation synthesis.