An Acoustic Model for English Speech Recognition Based on Deep Learning

Zhang Ling
DOI: https://doi.org/10.1109/icmtma.2019.00140
2019-04-01
Abstract: The acoustic model is one of the core modules of an English speech recognition system, and its performance directly affects the recognition accuracy of the final system. To address the problems of acoustic modeling, model optimization, and training efficiency in English speech recognition, this paper focuses on acoustic modeling based on deep learning. Starting from the basic theory of deep neural networks (DNNs) and hidden Markov models (HMMs), it studies the structure and parameter configuration of the DNN-HMM in detail. A new hybrid network model is proposed that uses clustered states rather than monophone states as the output units of the neural network, and it is applied to the acoustic model for English speech recognition. Simulation results show that, with a triphone structure, the improved scheme achieves better recognition performance by enhancing the speech features. The experimental results also verify that DNN-HMM acoustic modeling outperforms the traditional GMM-HMM method.
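As a rough illustration of what it means for clustered states to be the network's output units, the sketch below follows the common hybrid DNN-HMM recipe rather than the paper's own implementation, which the abstract does not describe. Several triphone states share one output unit, and the network's posterior for each unit is divided by that unit's prior to give a scaled likelihood for HMM decoding. The triphone names, the tying table, the logits, and the priors are all hypothetical toy values.

```python
import math

# Hypothetical tying table: several triphones share one clustered state,
# so the network needs one output unit per cluster, not per triphone.
TRIPHONE_TO_STATE = {"k-ae+t": 0, "b-ae+t": 0, "s-ih+t": 1}
NUM_STATES = len(set(TRIPHONE_TO_STATE.values()))


def softmax(logits):
    """Turn raw network outputs into posteriors P(state | frame)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_log_likelihoods(logits, state_priors):
    """Return log P(state | frame) - log P(state) for every clustered state.

    By Bayes' rule this equals log p(frame | state) up to a constant that is
    the same for all states, which is what the HMM decoder needs.
    """
    posteriors = softmax(logits)
    return [math.log(p) - math.log(q) for p, q in zip(posteriors, state_priors)]


def emission_score(triphone, logits, state_priors):
    """Look up the emission score of a triphone through its clustered state."""
    return scaled_log_likelihoods(logits, state_priors)[TRIPHONE_TO_STATE[triphone]]


if __name__ == "__main__":
    frame_logits = [1.2, 0.3]   # assumed network outputs for one frame
    priors = [0.7, 0.3]         # assumed state priors from training alignments
    for name in TRIPHONE_TO_STATE:
        print(name, round(emission_score(name, frame_logits, priors), 4))
```

Triphones tied to the same clustered state receive the same emission score, which is why tying keeps the output layer small even though the number of distinct triphones is large.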