Abstract: A framework for dialectal Chinese speech recognition is proposed and studied, in which a relatively small dialectal Chinese speech corpus (that is, a corpus of Chinese influenced by the speaker's native dialect) and dialect-related knowledge are used to transform a standard Chinese (Putonghua, abbreviated PTH) speech recognizer into a dialectal Chinese speech recognizer. Two kinds of knowledge sources are explored: expert knowledge and a small dialectal Chinese corpus. These knowledge sources provide information at four levels: the phonetic level, the lexicon level, the language level, and the acoustic decoder level. This paper takes Wu dialectal Chinese (WDC) as an example target language. The goal is to build a WDC speech recognizer from an existing PTH speech recognizer, based on the Initial-Final structure of the Chinese language and a study of how dialectal Chinese speakers speak Putonghua. The authors propose to use context-independent PTH-IF mappings (where IF means either a Chinese Initial or a Chinese Final), context-independent WDC-IF mappings, and syllable-dependent WDC-IF mappings (obtained from either experts or data), and to combine them with supervised maximum likelihood linear regression (MLLR) acoustic model adaptation. The IF mappings introduce a multi-pronunciation lexicon, and its growth can increase lexical confusion and thereby degrade performance. To limit the size of this lexicon, a Multi-Pronunciation Expansion (MPE) method based on the accumulated uni-gram probability (AUP) is proposed. In addition, some commonly used WDC words are selected and added to the lexicon. Compared with the original PTH speech recognizer, the resulting WDC speech recognizer achieves a 10–18% absolute Character Error Rate (CER) reduction when recognizing WDC, with only a 0.62% CER increase when recognizing PTH. The proposed framework and methods are expected to work not only for Wu dialectal Chinese but also for other dialectal Chinese languages and even other languages.
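The abstract does not spell out how the AUP-based expansion works, so the following is only a minimal sketch of one plausible reading: words are ranked by uni-gram probability, and pronunciation variants are generated only for the most frequent words whose accumulated probability reaches a threshold. The function names, the threshold value, the toy lexicon, and the IF mappings shown here are all illustrative assumptions, not the authors' implementation or data.

```python
from itertools import product


def select_words_by_aup(unigram, threshold):
    """Pick the most frequent words until their accumulated
    uni-gram probability reaches `threshold` (assumed criterion)."""
    selected = set()
    total = 0.0
    ranked = sorted(unigram.items(), key=lambda kv: kv[1], reverse=True)
    for word, prob in ranked:
        if total >= threshold:
            break
        selected.add(word)
        total += prob
    return selected


def expand_lexicon(lexicon, if_mappings, unigram, threshold):
    """Add mapped pronunciation variants only for AUP-selected words.

    lexicon:     word -> tuple of IF units (canonical pronunciation)
    if_mappings: IF unit -> list of alternative IF units (hypothetical)
    """
    chosen = select_words_by_aup(unigram, threshold)
    expanded = {}
    for word, pron in lexicon.items():
        variants = [tuple(pron)]  # canonical pronunciation always kept first
        if word in chosen:
            # Each unit may stay as-is or be replaced by a mapped unit.
            options = [[unit] + if_mappings.get(unit, []) for unit in pron]
            for candidate in product(*options):
                if candidate not in variants:
                    variants.append(candidate)
        expanded[word] = variants
    return expanded


# Toy data, purely illustrative (not taken from the paper).
lexicon = {
    "zhang": ("zh", "ang"),
    "shan": ("sh", "an"),
    "ming": ("m", "ing"),
}
unigram = {"zhang": 0.5, "shan": 0.3, "ming": 0.2}
if_mappings = {"zh": ["z"], "sh": ["s"], "ing": ["in"]}

chosen = select_words_by_aup(unigram, threshold=0.7)
expanded = expand_lexicon(lexicon, if_mappings, unigram, threshold=0.7)
for word, variants in expanded.items():
    print(word, variants)
```

In this toy setup, lowering the threshold keeps the lexicon closer to its original size, while raising it toward 1.0 expands every word. The trade-off between dialect coverage and lexical confusion is then controlled by a single parameter.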

Easytalk: a large-vocabulary speaker-independent Chinese dictation machine

A Large-Vocabulary Chinese Speech Recognition System

A Computer System for Chinese Character Speech Input

EasyCmd: Navigation by Voice Commands

Efficient Embedded Speech Recognition for Very Large Vocabulary Mandarin Car-Navigation Systems

A Real-World Speech Recognition System Based on CDCPMs

A Real-World Large Vocabulary Speaker-Independent Speech Recognition System

End-to-end Code-switched TTS with Mix of Monolingual Recordings

HandTalker II: a Chinese sign language recognition and synthesis system

Research on speech recognition models in the Chinese dictation machine

Speech recognition system on chip based on 5507 DSP

The Implementation of a Practical Chinese Speech Recognition System For The Parcel Post Checking Task

The dynamically-adjustable histogram pruning method for embedded voice dialing

A Dialectal Chinese Speech Recognition Framework

Real-Time Speech Recognition Method for Embedded System

A System for Mandarin Short Phrase Recognition on Portable Devices

Silenttalk: Lip Reading Through Ultrasonic Sensing on Mobile Phones

A Chinese Text-to-Speech System

Rapidly Developing Spoken Chinese Dialogue Systems with the D-Ear SDS SDK

English Speech Recognition System on Chip

Design and implementation of real-time telephone speech recognition system using DSP TMS320C31