Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech Synthesis.

Ling-Hui Chen, Yoshihiko Nankaku, Heiga Zen, Keiichi Tokuda, Zhen-Hua Ling, Li-Rong Dai
DOI: https://doi.org/10.21437/interspeech.2011-33
2011-01-01
Abstract: In standard approaches to hidden Markov model (HMM)-based speech synthesis, the window coefficients used to calculate dynamic features are pre-determined and fixed. Fixed coefficients may not be optimal for capturing the various context-dependent dynamic characteristics of speech signals. This paper proposes a data-driven technique to estimate the window coefficients, which are optimized to maximize the likelihood of trajectory HMMs given the data. Experimental results show that the proposed technique achieves naturalness of synthesized speech comparable to that of mean- and variance-updated trajectory HMMs, at significantly lower computational cost.
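To make concrete what "fixed window coefficients" means, the following is a minimal sketch of how dynamic (delta and delta-delta) features are typically derived from a static feature trajectory. It is not the paper's estimation method: the window values `[-0.5, 0, 0.5]` and `[1, -2, 1]` are values commonly used in practice and are assumed here, and the function name `apply_window` and the edge handling (repeating the boundary frame) are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def apply_window(static: Sequence[float], window: Sequence[float]) -> List[float]:
    """Apply an odd-length window to a static feature trajectory.

    Boundary frames are handled by repeating the first/last frame
    (an assumption for this sketch, not taken from the paper).
    """
    if len(window) % 2 != 1:
        raise ValueError("window length must be odd")
    half = len(window) // 2
    last = len(static) - 1
    out = []
    for t in range(len(static)):
        acc = 0.0
        for k, w in enumerate(window):
            # Clamp the frame index to the valid range at the edges.
            idx = min(max(t + k - half, 0), last)
            acc += w * static[idx]
        out.append(acc)
    return out


# Fixed, hand-chosen windows (assumed typical values, not from the paper).
DELTA_WINDOW = [-0.5, 0.0, 0.5]
ACCEL_WINDOW = [1.0, -2.0, 1.0]

static = [0.0, 1.0, 4.0, 9.0, 16.0]         # toy one-dimensional trajectory
delta = apply_window(static, DELTA_WINDOW)  # first-order dynamic feature
accel = apply_window(static, ACCEL_WINDOW)  # second-order dynamic feature
```

In this sketch the entries of `DELTA_WINDOW` and `ACCEL_WINDOW` are constants. The technique described in the abstract would instead treat such coefficients as parameters to be estimated from data.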