A General Procedure for Improving Language Models in Low-Resource Speech Recognition

Qian Liu, Wei-Qiang Zhang, Jia Liu, Yao Liu
DOI: https://doi.org/10.1109/IALP48816.2019.9037726
2019-01-01
Abstract: It is difficult for a language model (LM) to perform well in low-resource speech recognition, where only limited in-domain transcripts are available. In this paper, we summarize and extend several effective methods for exploiting out-of-domain data to improve LMs, including data selection, vocabulary expansion, lexicon augmentation, and multi-model fusion. The methods are integrated into a systematic procedure that proves effective for improving both n-gram and neural network LMs. Additionally, word vectors pre-trained on out-of-domain data are used to improve the RNN/LSTM LMs that rescore first-pass decoding results. Experiments on five Asian languages from the Babel Build Packs show that the improved LMs generally yield a 5.4-7.6% relative reduction in word error rate (WER) compared with the baseline ASR systems. For some languages, we achieve lower WER than recently published results on the same data sets.
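For readers unfamiliar with the metric, a relative WER reduction is the drop in WER divided by the baseline WER. The sketch below is only an illustration of that arithmetic: the function name `relative_wer_reduction` and the numbers used are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are WERs in percent (e.g. 50.0 means 50% WER).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - improved_wer) / baseline_wer


# Hypothetical values for illustration only; not results reported in the paper.
baseline = 50.0   # assumed baseline system WER (%)
improved = 47.0   # assumed WER after improving the LM (%)
print(f"{relative_wer_reduction(baseline, improved):.1f}% relative reduction")
```

With these assumed values, a 3-point absolute drop from a 50% baseline corresponds to a 6.0% relative reduction, which falls inside the 5.4-7.6% range quoted in the abstract.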