Lightly supervised acoustic model training for Mandarin continuous speech recognition

Xiangang Li, Zaihu Pang, Xihong Wu
DOI: https://doi.org/10.1007/978-3-642-36669-7_88
2013-01-01
Abstract: This paper investigates a lightly supervised acoustic model training method for Mandarin continuous speech recognition. Speech materials with rough transcriptions, which provide light supervision for acoustic model training, are now available in various forms. In this work, the quality problems of such data are classified into two types: the first is non-speech and low-quality speech in the corpora, and the second is transcription errors. A framework is proposed to tackle the two types separately: speech recognition with a transcription-relevant language model is used to remove the first type, while recognition with a general language model provides candidate transcription errors, which are then checked by a final automatic verification process. The proposed framework was evaluated from two aspects: the data quality improved significantly, and speech recognition experiments showed a 21.88% relative character error rate (CER) reduction.
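For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate and a relative CER reduction are conventionally computed. It is not taken from the paper: the function names `edit_distance`, `cer`, and `relative_reduction` are hypothetical, and the example strings and rates are illustrative values only, not results reported by the authors.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - improved) / baseline


# Illustrative values only (not from the paper).
example_cer = cer("今天天气很好", "今天天汽很好")   # one substituted character
example_gain = relative_reduction(0.20, 0.15)      # hypothetical CERs
```

Under this definition, a relative reduction compares the improved system's CER against the baseline's CER rather than reporting the absolute difference between them.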