Abstract: Tone recognition is a core function in Chinese speech perception. The tone perception ability of people with sensorineural hearing loss (SNHL) is often weaker than that of people with normal hearing. Automatic tone enhancement would help them understand Chinese speech better. In this paper, we focus on a tone enhancing model for Chinese disyllable words. We first analyze the acoustic features related to tone perception. Using agglomerative hierarchical clustering, the first and second syllables of disyllable words are each clustered into 6 clusters. Discriminative features of these clusters are experimentally determined from a set of candidate features related to tone perception, such as the pitch value, pitch range, and position of minimum pitch. We further propose a practical tone enhancing model based on these discriminative features: 1) the distance between an input pitch contour and the centroid of each cluster is calculated; 2) the contour is assigned to the cluster with the smallest distance; 3) the pitch contour is modified for tone enhancement using TD-PSOLA with the model parameters corresponding to that cluster. Both statistical and subjective experiments show that a higher hit rate of tone recognition is obtained after tone enhancement with the proposed model. In particular, the proposed model avoids traditional tone recognition, which makes it more convincing and less laborious.
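
The three-step procedure in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature set (mean pitch, pitch range, relative position of the minimum), the function names, the centroid values, and the per-cluster expansion factors are all hypothetical, and only two clusters are shown instead of six. The last step merely computes a target contour with a wider pitch range; actual resynthesis with TD-PSOLA is outside the scope of the sketch. In practice the features would likely need normalization, since they are on different scales.

```python
import math

def contour_features(f0):
    """Summarize a pitch contour (Hz per voiced frame, at least 2 frames)
    as (mean pitch, pitch range, relative position of the minimum)."""
    lo, hi = min(f0), max(f0)
    min_pos = f0.index(lo) / (len(f0) - 1)  # 0.0 = first frame, 1.0 = last frame
    return (sum(f0) / len(f0), hi - lo, min_pos)

def nearest_cluster(features, centroids):
    """Steps 1 and 2: compute the Euclidean distance to every centroid
    and return the index of the closest one."""
    distances = [math.dist(features, c) for c in centroids]
    return distances.index(min(distances))

def expand_range(f0, factor):
    """Step 3 (stand-in): scale deviations from the mean pitch by `factor`
    to obtain a target contour that a pitch modifier could then impose."""
    mean = sum(f0) / len(f0)
    return [mean + factor * (v - mean) for v in f0]

# Hypothetical centroids and expansion factors, for illustration only.
centroids = [(200.0, 20.0, 0.5), (180.0, 60.0, 1.0)]
factors = [1.2, 1.5]

contour = [220.0, 210.0, 195.0, 175.0, 160.0]  # a falling contour
cluster = nearest_cluster(contour_features(contour), centroids)
target = expand_range(contour, factors[cluster])
```

Because classification reduces to a nearest-centroid lookup, no separate tone recognizer is needed before enhancement, which matches the abstract's closing point.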

Maximum Entropy Based Tone Modeling for Mandarin Speech Recognition

Exploiting Prosodic and Lexical Features for Tone Modeling in A Conditional Random Field Framework

Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information

The Research on Applying Tone Information in Mandarin Speech Recognition

A Maximum Entropy Based Hierarchical Model for Automatic Prosodic Boundary Labeling in Mandarin

An Innovative Prosody Modeling Method for Chinese Speech Recognition

Prosodic boundary prediction based on maximum entropy model with error-driven modification

Tone recognition in Mandarin spontaneous speech

Robust F0 Modeling for Mandarin Speech Recognition in Noise

Real Context Model for Tone Recognition in Mandarin Conversational Telephone Speech

Research on Tone Recognition in Chinese Spontaneous Speech

Prosodic Modeling with Rich Syntactic Context in HMM-based Mandarin Speech Synthesis

Modeling Prosody Patterns for Chinese Expressive Text-to-speech Synthesis

A Real-Time Tone Enhancement Method for Continuous Mandarin Speeches

Automatic detection of tone mispronunciation in Mandarin

PHMM Based Asynchronous Acoustic Model for Chinese Large Vocabulary Continuous Speech Recognition

Probing the phonetic and phonological knowledge of tones in Mandarin TTS models

Prosody Model for Mandarin Text-to-Speech System

Tone Enhancing Model for Disyllable Words in Chinese Mandarin Speech

End-to-End Mandarin Tone Classification with Short Term Context Information

ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis