Abstract: Tone modeling is important for Mandarin speech recognition. In this paper, a Mixture Stochastic Polynomial Tone Model (MSPTM) is proposed for tone modeling in continuous Mandarin speech. In this model, the pitch contour, the main carrier of the tone pattern, is described as a mixture of stochastic trajectories. The mean trajectory is represented by a polynomial function of normalized time, while the variance is time-varying. Effective training and tone recognition algorithms were developed. In experiments, the proposed MSPTM reduced the tone recognition error rate by 40.7% relative to the traditional Hidden Markov Model (HMM) tone model. We also present a decision-tree-based approach to learning tone pattern variation in continuous speech. The phonetic and linguistic factors that may affect tone patterns were taken into consideration when constructing the tree. Once the tree was built, 28 different tone patterns were obtained. We found that, in addition to the tone of the neighboring syllable, the consonant/vowel type of the syllable and the position of the syllable in the utterance also contributed substantially to tone pattern variation in continuous speech. Finally, a new approach to integrating tone information into the search process at the word level is discussed. Experiments on continuous Mandarin speech recognition showed that the new tone model and the tone information integration method were effective, achieving a 16.2% relative reduction in character error rate.
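To make the trajectory description above concrete, the following is a minimal sketch of how a pitch contour could be scored against a mixture of polynomial mean trajectories with time-varying variance. It is an illustration under stated assumptions, not the paper's implementation: the frames are treated as independent Gaussians given the component, and every name here (`normalized_times`, `poly_mean`, `component_log_likelihood`, `mixture_log_likelihood`, `classify_tone`, the `var_at` callback) is hypothetical. Training of the coefficients, variances, and mixture weights is not covered.

```python
import math


def normalized_times(n_frames):
    """Map frame indices 0..n_frames-1 onto normalized time in [0, 1]."""
    if n_frames == 1:
        return [0.0]
    return [i / (n_frames - 1) for i in range(n_frames)]


def poly_mean(coeffs, t):
    """Evaluate the mean trajectory c0 + c1*t + c2*t^2 + ... (Horner's rule)."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * t + c
    return value


def component_log_likelihood(f0, coeffs, var_at):
    """Log-likelihood of a pitch contour under one trajectory component.

    Assumption (not from the source): frames are independent Gaussians
    centred on the polynomial mean, with variance var_at(t) at time t.
    """
    total = 0.0
    for t, y in zip(normalized_times(len(f0)), f0):
        var = var_at(t)
        diff = y - poly_mean(coeffs, t)
        total += -0.5 * (math.log(2.0 * math.pi * var) + diff * diff / var)
    return total


def mixture_log_likelihood(f0, components):
    """Combine components given as (weight, coeffs, var_at) via log-sum-exp."""
    logs = [math.log(w) + component_log_likelihood(f0, c, v)
            for w, c, v in components]
    peak = max(logs)
    return peak + math.log(sum(math.exp(x - peak) for x in logs))


def classify_tone(f0, tone_models):
    """Return the tone label whose mixture gives the highest log-likelihood."""
    return max(tone_models,
               key=lambda tone: mixture_log_likelihood(f0, tone_models[tone]))
```

As a usage example with made-up toy parameters, a model set such as `{"rising": [(1.0, [0.0, 1.0], lambda t: 0.05 + 0.05 * t)], "falling": [(1.0, [1.0, -1.0], lambda t: 0.05 + 0.05 * t)]}` can be passed to `classify_tone` together with a normalized pitch contour. Using normalized time lets syllables of different durations be compared against the same polynomial.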

Tone Model Integration Using Tree Based Weight Parameter Tying in Mandarin Speech Recognition

Automatic Context Induction for Tone Model Integration in Mandarin Speech Recognition

Tone Model Integration Based on Discriminative Weight Training for Putonghua Speech Recognition

Discriminative Tone Model Training and Optimal Integration for Mandarin Speech Recognition

Decision Tree Based Mandarin Tone Model And Its Application To Speech Recognition

Tone Modeling Based on Discriminative Training for Mandarin Speech Recognition

Tone Modeling for Continuous Mandarin Speech Recognition

Discriminative Incorporation of Explicitly Trained Tone Models into Lattice Based Rescoring for Mandarin Speech Recognition

Tone modeling based on hidden conditional random fields and discriminative model weight training

Maximum Entropy Based Tone Modeling for Mandarin Speech Recognition

Refining Context-Dependent Tonal Acoustic Modeling in Mandarin LVCSR

Mandarin tone recognition considering context information

Lattice Based Discriminative Model Combination Using Automatically Induced Phonetic Contexts

Research on Context-Dependent Acoustical Unit (Triphone) for Mandarin Continuous Speech Recognition

Exploiting Prosodic and Lexical Features for Tone Modeling in A Conditional Random Field Framework

Improved Speech Recognition Using Discriminative Integration of Multiple Local Classifiers in Lattice Rescoring

An Investigation of the Target Approximation Model for Tone Modeling and Recognition in Continuous Mandarin Speech

Real Context Model for Tone Recognition in Mandarin Conversational Telephone Speech

ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis

Integrated Tone Evaluation in Mandarin CALL Systems Using Competing Model Based Approach

Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information