Abstract: Tone recognition is a core function in Chinese speech perception. The tone perception ability of people with sensorineural hearing loss (SNHL) is often weaker than that of people with normal hearing, so automatic tone enhancement could help them understand Chinese speech better. In this paper, we focus on a tone enhancing model for Chinese disyllable words. We first analyze the acoustic features related to tone perception. Using agglomerative hierarchical clustering, the first and second syllables of disyllable words are each clustered into 6 clusters. Discriminative features of these clusters are determined experimentally from a set of candidate features related to tone perception, such as the pitch value, the pitch range and the position of the minimum pitch. We then propose a practicable tone enhancing model based on these discriminative features: 1) the distance between an input pitch contour and the centroid of each cluster is calculated; 2) the contour is assigned to the cluster with the smallest distance; 3) the pitch contour is modified for tone enhancement using TD-PSOLA, with the model parameters corresponding to that cluster. Both statistical and subjective experiments show that a higher tone recognition hit rate is obtained after tone enhancement with the proposed model. Notably, the proposed model does not require a traditional tone recognition step, which makes it more convincing and less laborious.
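
The classification steps 1) and 2) amount to a nearest-centroid rule, which can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names, the choice of mean pitch as the "pitch value", the Euclidean distance, and the two example centroids are all hypothetical, since the abstract does not specify them. In practice the features would likely need to be normalized to comparable scales before distances are computed.

```python
import math


def extract_features(contour):
    """Map a pitch contour (F0 values in Hz, length >= 2) to a feature vector.

    Assumed features (the abstract lists these only as candidates):
    mean pitch, pitch range, and relative position of the minimum pitch (0..1).
    """
    lo, hi = min(contour), max(contour)
    mean = sum(contour) / len(contour)
    min_pos = contour.index(lo) / (len(contour) - 1)
    return (mean, hi - lo, min_pos)


def nearest_cluster(features, centroids):
    """Return (cluster_id, distance) for the centroid closest to `features`.

    Euclidean distance is an assumption; the abstract does not name the metric.
    """
    best_id, best_dist = None, math.inf
    for cluster_id, centroid in centroids.items():
        d = math.dist(features, centroid)
        if d < best_dist:
            best_id, best_dist = cluster_id, d
    return best_id, best_dist


# Hypothetical centroids for two of the clusters (illustrative values only).
centroids = {
    "falling": (200.0, 80.0, 1.0),
    "dipping": (180.0, 60.0, 0.5),
}

contour = [220, 190, 160, 150, 170, 200]
cluster_id, distance = nearest_cluster(extract_features(contour), centroids)
print(cluster_id)
```

The selected cluster would then determine which set of model parameters is used for the TD-PSOLA pitch modification in step 3), which is not sketched here.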

Exploiting Prosodic and Lexical Features for Tone Modeling in a Conditional Random Field Framework

Maximum Entropy Based Tone Modeling for Mandarin Speech Recognition

Mandarin tone modeling using recurrent neural networks

Real Context Model for Tone Recognition in Mandarin Conversational Telephone Speech

Tone Recognition in Mandarin Spontaneous Speech

Research on Tone Recognition in Chinese Spontaneous Speech

The Research on Applying Tone Information in Mandarin Speech Recognition

An Innovative Prosody Modeling Method for Chinese Speech Recognition

End-to-End Mandarin Tone Classification with Short Term Context Information

Tone Recognition of Chinese Continuous Speech

Tone Enhancing Model for Disyllable Words in Chinese Mandarin Speech

Study on the Identification of the Isolated Word in Mandarin Speech Recognition with Tone Information

Robust F0 Modeling for Mandarin Speech Recognition in Noise

PHMM Based Asynchronous Acoustic Model for Chinese Large Vocabulary Continuous Speech Recognition

A Real-Time Tone Enhancement Method for Continuous Mandarin Speeches

Probing the phonetic and phonological knowledge of tones in Mandarin TTS models

A Superposed Prosodic Model for Chinese Text-To-Speech Synthesis

Robust Audio-Visual Mandarin Speech Recognition Based on Adaptive Decision Fusion and Tone Features

Using Conditional Random Fields to Predict Focus Word Pair in Spontaneous Spoken English

BLSTM-CRF Based End-to-End Prosodic Boundary Prediction with Context Sensitive Embeddings in a Text-to-Speech Front-End

Investigation of Modeling Units for Mandarin Speech Recognition Using DFSMN-CTC-sMBR