Long short term memory recurrent neural network acoustic models using i-vector for low resource speech recognition

Guangxu Huang, Yao Tian, Jian Kang, Jia Liu, Shanhong Xia
DOI: https://doi.org/10.3969/j.issn.1001-3695.2017.02.016
2017-01-01
Abstract: In low-resource conditions, little labeled training data is available, and speech recognition systems perform poorly. To address this problem, this paper first investigates long short-term memory recurrent neural networks (LSTM RNNs) for acoustic modeling. LSTM RNNs are a powerful tool for modeling long time series and can make full use of context information, while a linear projection layer reduces the number of model parameters. The paper then explores speaker modeling methods in the feature space and extracts identity vectors (i-vectors), which capture speaker and channel information simultaneously. Finally, it presents a novel system that combines the LSTM RNN model with i-vector features. Results on the standard Open KWS 2013 data set show a relative improvement of about 10% in TER over the DNN baseline system.
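The two mechanisms named in the abstract, the linear projection layer and the i-vector feature, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a plain LSTM layer without peephole connections, assumes the i-vector is combined with the acoustic features by appending it to every frame (one common approach; the abstract does not specify the method), and uses made-up sizes (40-dimensional frames, a 100-dimensional i-vector, 1024 cells, 256 projection units). The function names are hypothetical.

```python
# Illustrative sketch only: all sizes and names are assumptions, not values from the paper.

def lstm_params(n_in, n_cell):
    """Weights and biases of a plain LSTM layer (4 gate blocks, no peepholes)."""
    return 4 * (n_cell * (n_in + n_cell) + n_cell)


def lstmp_params(n_in, n_cell, n_proj):
    """LSTM layer whose output is linearly projected to n_proj units
    before being fed back, so the recurrent matrices shrink."""
    gates = 4 * (n_cell * (n_in + n_proj) + n_cell)
    projection = n_cell * n_proj  # the linear projection matrix itself
    return gates + projection


def append_ivector(frames, ivector):
    """Concatenate one utterance-level i-vector onto every acoustic frame."""
    return [list(frame) + list(ivector) for frame in frames]


# Two dummy 40-dimensional frames and a dummy 100-dimensional i-vector.
frames = [[0.1] * 40, [0.2] * 40]
ivector = [0.0] * 100
augmented = append_ivector(frames, ivector)

n_in = len(augmented[0])  # input size seen by the first LSTM layer
full = lstm_params(n_in, 1024)
projected = lstmp_params(n_in, 1024, 256)
print(f"input dim: {n_in}, plain LSTM: {full}, projected LSTM: {projected}")
```

Under these assumed sizes, the projected layer needs well under half the parameters of the plain layer, because the recurrent weight matrices scale with the projection size rather than the cell count.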