Just-in-time Latent Semantic Adaptation on Language Model for Chinese Speech Recognition Using Web Data

Qin Gao, Xiaojun Lin, Xihong Wu
DOI: https://doi.org/10.1109/slt.2006.326814
2006-01-01
Abstract: A novel method is proposed for performing just-in-time adaptation of language models in Chinese speech recognition using Web search engines. Latent semantic analysis (LSA) is employed to adjust the probability distribution of the N-gram language model. The method has two advantages. First, it needs only a relatively small amount of data, which can be obtained from the Web on the fly. Second, compared with the traditional LSA adaptation formula, the proposed approach is more efficient, which ensures that second-pass decoding can be performed at high speed. Experiments show that the perplexity of the language model is reduced by over 13% after adaptation. A 4.29% relative reduction in WER is achieved in large-vocabulary Chinese speech recognition on a standard test set.
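The abstract does not state the adaptation formula, so the following is only a minimal sketch of the general idea of LSA-based adaptation, not the authors' method. It assumes that each word already has a low-dimensional LSA vector and that the retrieved Web text has been folded into a single topic vector; the names `adapt`, `word_vecs`, `topic_vec`, and `gamma`, and all numbers, are hypothetical. Each base N-gram probability is rescaled by a similarity-based factor and the distribution is renormalized.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def adapt(base_probs, word_vecs, topic_vec, gamma=1.0):
    """Rescale P(w | history) by an LSA-similarity factor, then renormalize.

    Illustrative assumption: the factor is ((1 + cos) / 2) ** gamma, which
    maps cosine similarity from [-1, 1] into [0, 1].
    """
    scaled = {}
    for word, prob in base_probs.items():
        sim = cosine(word_vecs[word], topic_vec)
        scaled[word] = prob * ((1.0 + sim) / 2.0) ** gamma
    total = sum(scaled.values())
    if total == 0.0:
        # Degenerate case: every factor vanished, so keep the base model.
        return dict(base_probs)
    return {word: value / total for word, value in scaled.items()}


# Toy data (hypothetical): base N-gram probabilities for one history,
# 2-dimensional LSA word vectors, and a topic vector from retrieved text.
base_probs = {"stock": 0.2, "market": 0.3, "river": 0.5}
word_vecs = {"stock": [1.0, 0.0], "market": [0.9, 0.1], "river": [0.0, 1.0]}
topic_vec = [1.0, 0.0]

adapted = adapt(base_probs, word_vecs, topic_vec)
for word in base_probs:
    print(word, round(base_probs[word], 3), "->", round(adapted[word], 3))
```

In this toy setting, words whose vectors lie close to the topic vector gain probability mass at the expense of off-topic words, while the adapted values still sum to one.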