Improved Phonotactic Language Recognition Based on RNN Feature Reconstruction

Wei-Wei Liu, Wei-Qiang Zhang, Yongzhe Shi, An Ji, Jiaming Xu, Jia Liu
DOI: https://doi.org/10.1109/icassp.2014.6854619
2014-01-01
ICASSP
Abstract: Phone recognition followed by support vector machine (PR-SVM) has been proposed for language recognition tasks and has shown encouraging results. However, it still suffers from problems such as the curse of dimensionality caused by the increasing order of the N-gram feature supervector and the rapidly growing number of possible parameters that results from exact matching of the phoneme history. These problems hamper the ability of the N-gram vector space model (VSM) to handle long-term contexts. In this paper, a recurrent neural network (RNN) based feature reconstruction (FR) method is presented to compensate for the deficiencies of the N-gram feature in phonotactic language recognition. Experiments are carried out on the 2009 National Institute of Standards and Technology language recognition evaluation (NIST LRE) database. The results show that the proposed method gives relative error rate reductions of 8.76%, 3.82%, and 11.93% for 30 s, 10 s, and 3 s, respectively, compared with the baseline system.
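To make the dimensionality problem concrete, the following is a minimal sketch of how a phone N-gram supervector grows with the N-gram order. It is not the authors' implementation: the function name `ngram_supervector`, the 4-phone inventory, and the decoded phone string are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product


def ngram_supervector(phones, inventory, order):
    """Relative frequency of every possible phone N-gram of the given order."""
    grams = [tuple(phones[i:i + order]) for i in range(len(phones) - order + 1)]
    counts = Counter(grams)
    total = len(grams)
    # One dimension per possible N-gram, so the vector has
    # len(inventory) ** order entries regardless of the utterance length.
    return [
        counts[g] / total if total else 0.0
        for g in product(inventory, repeat=order)
    ]


# Hypothetical toy data (assumption, not taken from the paper).
inventory = ["a", "b", "k", "s"]
phones = ["k", "a", "s", "a", "b", "a", "k", "a"]

for order in (1, 2, 3):
    vec = ngram_supervector(phones, inventory, order)
    nonzero = sum(1 for v in vec if v > 0)
    print(f"order={order} dimensions={len(vec)} nonzero={nonzero}")
```

Even in this toy setting the vector length is the inventory size raised to the N-gram order, while the number of N-grams actually observed in a short utterance stays small. With a realistic inventory of several dozen phones, higher orders quickly become very large and sparse, which is the limitation the abstract describes.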