Distant-Talking Accent Recognition by Combining GMM and DNN

Khomdet Phapatanaburi, Longbiao Wang, Ryota Sakagami, Zhaofeng Zhang, Ximin Li, Masahiro Iwahashi
DOI: https://doi.org/10.1007/s11042-015-2935-4
IF: 2.577
2015-01-01
Multimedia Tools and Applications
Abstract: Automatic accent recognition has recently received increasing attention. However, little research has focused on accent recognition in distant-talking environments, which is important for improving distant-talking speech recognition of non-native accented speech. In this paper, we apply Gaussian mixture models (GMM) and deep neural networks (DNN) to identify the speaker's accent in reverberant environments. We also propose combining the likelihoods from these two approaches. In a reverberant environment, the accent recognition rate improved from 90.7 % with GMM to 93.0 % with DNN. The combination of GMM and DNN achieved a recognition rate of 97.5 %, outperforming both the individual GMM and the individual DNN because the two models are complementary. The relative error reduction is 73.1 % compared with the GMM-based method and 64.3 % compared with the DNN-based method.
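The reported relative error reductions follow directly from the recognition rates: the error rate is 100 minus the recognition rate, and the reduction is the drop in error divided by the baseline error. The Python sketch below reproduces that arithmetic. It also includes a hypothetical `fuse_scores` helper showing one common way to combine two per-accent score vectors, a weighted sum of log-scores. The abstract does not specify the fusion rule, so the weighted sum, the weight `alpha`, and both function names are assumptions made for illustration, not the authors' method.

```python
def relative_error_reduction(baseline_rate, new_rate):
    """Relative error reduction in percent, given recognition rates in percent."""
    baseline_error = 100.0 - baseline_rate
    new_error = 100.0 - new_rate
    return 100.0 * (baseline_error - new_error) / baseline_error


def fuse_scores(gmm_scores, dnn_scores, alpha=0.5):
    """Hypothetical fusion rule (an assumption, not taken from the paper).

    Takes one log-score per accent class from each model, forms the
    weighted sum alpha * GMM + (1 - alpha) * DNN, and returns the index
    of the accent class with the highest fused score.
    """
    fused = [alpha * g + (1.0 - alpha) * d
             for g, d in zip(gmm_scores, dnn_scores)]
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Recognition rates reported in the abstract.
    print(round(relative_error_reduction(90.7, 97.5), 1))  # versus GMM
    print(round(relative_error_reduction(93.0, 97.5), 1))  # versus DNN
```

With the rates from the abstract, the first function gives 73.1 % relative to GMM (error 9.3 % down to 2.5 %) and 64.3 % relative to DNN (error 7.0 % down to 2.5 %), matching the reported figures. In the fusion sketch, `alpha` would normally be tuned on held-out data, and the two score vectors may need normalization before they are summed.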