A CANTONESE ACCENT CHINESE SPEECH CORPUS

Shuqing Li, Fang Zheng, Mingxing Xu, Ziqi Song, Ditang Fang
1999-01-01
Abstract: To support research on the robustness of continuous Chinese speech recognition systems, we have established a Cantonese accent Chinese speech corpus (CACSC), the first in a series of Chinese speech corpora with different accents. CACSC contains 25 gigabytes of utterances spoken by 104 male and 100 female speakers. The speech was sampled at 16 kHz with 16-bit precision through a standard SoundBlaster card in a personal computer, in an ordinary office environment. The speech in CACSC is mainly standard Chinese (Mandarin) with a light Cantonese accent. CACSC offers a test bed for speech recognition that is robust to a particular regional accent. This paper describes in detail how CACSC was established and what its features are.
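To give a rough sense of scale, the stated recording parameters can be turned into an approximate duration. The following is a minimal sketch, not a figure from the paper: it assumes single-channel (mono) uncompressed PCM, reads "25 gigabytes" as 25 × 10^9 bytes, and ignores file headers and any non-audio data. The function name `estimate_hours` is hypothetical.

```python
# Parameters stated in the abstract.
SAMPLE_RATE_HZ = 16_000      # 16 kHz sampling rate
BYTES_PER_SAMPLE = 2         # 16-bit precision
NUM_SPEAKERS = 104 + 100     # 104 male + 100 female speakers

# Assumptions NOT stated in the abstract.
CHANNELS = 1                 # assumed mono recording
CORPUS_BYTES = 25 * 10**9    # assumed decimal gigabytes, audio data only


def estimate_hours(total_bytes, sample_rate_hz, bytes_per_sample, channels):
    """Return the duration in hours of uncompressed PCM audio of the given size."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return total_bytes / bytes_per_second / 3600


total_hours = estimate_hours(CORPUS_BYTES, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE, CHANNELS)
hours_per_speaker = total_hours / NUM_SPEAKERS

print(f"Approximate total duration: {total_hours:.1f} hours")
print(f"Approximate duration per speaker: {hours_per_speaker:.2f} hours")
```

Under these assumptions the corpus would amount to roughly 217 hours of audio, or about one hour per speaker. A stereo recording or a binary reading of "gigabyte" would change the estimate accordingly.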