An Improved LSTM for Language Identification

Qingran Zhan, Liqiang Zhang, Hui Deng, Xiang Xie
DOI: https://doi.org/10.1109/icsp.2018.8652445
2018-01-01
Abstract: In this paper, we propose a novel framework that combines the phonetic temporal neural model (PTN) with an improved LSTM (IM-LSTM). The IM-LSTM adds an up-down connection from time step t to t+1 in the LSTM structure, which aims to capture latent information from the previous time step. The updated structure better discriminates the frame-level phonetic information produced by the PTN. On the AP16-OLR language identification dataset, our final model achieves relative improvements over the standard PTN of 5.04%, 2.19%, and 2.73% in EER and 6.55%, 5.81%, and 2.23% in C-avg under the 1s, 3s, and full-length utterance conditions, respectively. The proposed framework outperforms the standard PTN and the other proposed models, particularly in the 1s condition. This shows the efficacy and flexibility of the proposed method.
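The percentages above are stated relative to the standard PTN baseline. As a minimal sketch, assuming they are relative reductions of a lower-is-better metric such as EER or C-avg (the abstract does not spell out the formula), the calculation could look like the following. The function name `relative_reduction` and the sample values are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Relative reduction of a lower-is-better metric, in percent.

    Assumed formula: (baseline - proposed) / baseline * 100.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline * 100.0


# Hypothetical values for illustration only; these are NOT results from the paper.
baseline_eer = 10.0  # baseline EER in percent (made up)
proposed_eer = 9.5   # proposed-model EER in percent (made up)

print(f"Relative EER reduction: {relative_reduction(baseline_eer, proposed_eer):.2f}%")
```

Under this reading, a positive value means the proposed model has a lower error than the baseline, and a negative value means it is worse.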