Abstract: In the current hidden Markov model (HMM) based unit selection speech synthesis method, the optimal phone-sized candidate units are selected following the maximum likelihood (ML) criterion of HMMs trained for various acoustic features. This paper introduces statistical models for syllable-level F0 features into this method. Unlike the frame-level F0 parameters used in the current framework, the pitch contour of the vowel in each syllable and its combination across adjacent syllables are extracted to represent the suprasegmental properties of F0. A context-dependent statistical model is trained on these syllable-level F0 features, and the likelihood function of this model is integrated into the unit selection criterion to evaluate the suprasegmental prosody of a given unit sequence. Because syllable-level F0 modeling makes the candidate units for the vowels of adjacent syllables depend on each other, the conventional dynamic programming search for phone-sized unit selection is modified to take this dependency into account. Our experimental results show that this method significantly improves the naturalness of synthesized speech.
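
The modified search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy candidates, the three scoring functions (`phone_ll`, `join_ll`, `syl_f0_ll`), the constants, and the choice to extend the search state with the most recently selected vowel unit are all assumptions made for illustration. The idea shown is that once a syllable-level F0 term links the vowels of adjacent syllables, a Viterbi-style search can remain exact if each state records both the current unit and the last vowel unit.

```python
import itertools

# Toy candidates per phone: (unit_id, mean_f0_hz, spectral_feature).
# All values are made up for illustration.
CANDS = [
    [("b1", 0.0, 1.1), ("b2", 0.0, 0.7)],                          # consonant
    [("a1", 180.0, 2.2), ("a2", 210.0, 1.9)],                      # vowel, syllable 1
    [("d1", 0.0, 1.4), ("d2", 0.0, 1.8)],                          # consonant
    [("i1", 170.0, 2.4), ("i2", 200.0, 2.6), ("i3", 230.0, 2.5)],  # vowel, syllable 2
]
IS_VOWEL = [False, True, False, True]
TARGET_SPEC = [1.0, 2.0, 1.5, 2.5]   # hypothetical phone-level targets
EXPECTED_DELTA_F0 = -10.0            # hypothetical F0 change between adjacent vowels

def phone_ll(t, u):
    """Phone-level log-likelihood stand-in (Gaussian-like, up to a constant)."""
    return -(u[2] - TARGET_SPEC[t]) ** 2

def join_ll(prev, u):
    """Concatenation log-likelihood stand-in between consecutive units."""
    return -0.5 * abs(prev[2] - u[2])

def syl_f0_ll(prev_vowel, vowel):
    """Syllable-level F0 log-likelihood stand-in for adjacent-syllable vowels."""
    return -((vowel[1] - prev_vowel[1]) - EXPECTED_DELTA_F0) ** 2 / 200.0

def total_ll(path):
    """Combined criterion for a full unit sequence."""
    score, last_vowel = 0.0, None
    for t, u in enumerate(path):
        score += phone_ll(t, u)
        if t > 0:
            score += join_ll(path[t - 1], u)
        if IS_VOWEL[t]:
            if last_vowel is not None:
                score += syl_f0_ll(last_vowel, u)
            last_vowel = u
    return score

def select_units(candidates, is_vowel):
    """Viterbi-style search with state (current unit, most recent vowel unit)."""
    states = {(None, None): (0.0, [])}  # state -> (best score, best path)
    for t, cands in enumerate(candidates):
        new_states = {}
        for (prev, last_vowel), (score, path) in states.items():
            for u in cands:
                s = score + phone_ll(t, u)
                if prev is not None:
                    s += join_ll(prev, u)
                new_last = last_vowel
                if is_vowel[t]:
                    if last_vowel is not None:
                        s += syl_f0_ll(last_vowel, u)
                    new_last = u
                key = (u, new_last)
                if key not in new_states or s > new_states[key][0]:
                    new_states[key] = (s, path + [u])
        states = new_states
    return max(states.values(), key=lambda item: item[0])

best_score, best_path = select_units(CANDS, IS_VOWEL)
brute_force_best = max(total_ll(p) for p in itertools.product(*CANDS))
print([u[0] for u in best_path], best_score, brute_force_best)
```

Under these assumptions the extended state multiplies the number of search states at each phone by the number of candidates for the preceding vowel, which is the cost of keeping the search exact; the brute-force maximum is computed only as a check on the toy example.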

Statistical modeling of syllable-level F0 features for HMM-based unit selection speech synthesis

Statistical Acoustic Model Based Unit Selection Algorithm for Speech Synthesis

HMM-based Unit Selection Using F

HMM-Based Hierarchical Unit Selection Combining Kullback-Leibler Divergence with Likelihood Criterion

A Hierarchical F0 Modeling Method for HMM-based Speech Synthesis

Improving F0 prediction using bidirectional associative memories and syllable-level F0 features for HMM-based Mandarin speech synthesis

HMM-based Unit Selection Using Frame Sized Speech Segments

HMM-based Unit Selection Speech Synthesis Using Log Likelihood Ratios Derived from Perceptual Data

Multi-Layer F0 Modeling for HMM-Based Speech Synthesis

Trainable Unit Selection Speech Synthesis under Statistical Framework

Voiced/unvoiced Decision Algorithm for HMM-based Speech Synthesis

A Novel Hybrid Approach for Mandarin Speech Synthesis

Selecting optimal non-uniform units for hierarchical unit selection

Hierarchical Non-Uniform Unit Selection Based on Prosodic Structure

Stable boundary-based non-uniform unit selection in speech synthesis

Building HMM based unit-selection speech synthesis system using synthetic speech naturalness evaluation score

Investigation of Prosodic F0 Layers in Hierarchical F0 Modeling for HMM-based Speech Synthesis

Formant-Controlled HMM-Based Speech Synthesis

Asynchronous F0 and Spectrum Modeling for HMM-based Speech Synthesis

Improved unit selection speech synthesis method utilizing subjective evaluation results on synthetic speech