Dynamic temporal residual learning for speech recognition

Jiaqi Xie, Ruijie Yan, Shanyu Xiao, Liangrui Peng, Michael T. Johnson, Wei-Qiang Zhang
DOI: https://doi.org/10.1109/icassp40776.2020.9054653
2020-01-01
Abstract: Long short-term memory (LSTM) networks have been widely used in automatic speech recognition (ASR). This paper proposes a novel dynamic temporal residual learning mechanism for LSTM networks to better explore temporal dependencies in sequential data. The mechanism is implemented by applying shortcut connections with dynamic weights to temporally adjacent LSTM outputs. Two methods for generating the dynamic weights are proposed: using a secondary network and using a random weight generator. Experimental results on the Wall Street Journal (WSJ) speech recognition dataset show that the proposed methods outperform the baseline LSTM network.
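The abstract does not give the exact formulation, so the following is only a minimal sketch of one plausible reading: each LSTM output receives a shortcut from the output at the previous time step, scaled by a weight that is recomputed at every step. The additive form y_t = h_t + w_t * h_{t-1}, the single-layer sigmoid gate standing in for the "secondary network", the uniform distribution of the random weights, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def temporal_residual(outputs, weight_fn):
    """Add a weighted shortcut from the previous step's output to each output.

    Assumed form: y_t = h_t + w_t * h_{t-1}, with w_t a scalar chosen per step.
    """
    mixed = [list(outputs[0])]  # the first step has no predecessor
    for t in range(1, len(outputs)):
        prev, cur = outputs[t - 1], outputs[t]
        w = weight_fn(prev, cur)  # dynamic weight for this time step
        mixed.append([c + w * p for c, p in zip(cur, prev)])
    return mixed


def make_secondary_network(params, bias=0.0):
    """Hypothetical secondary network: sigmoid of a linear map over [prev; cur]."""
    def weight_fn(prev, cur):
        z = bias + sum(a * x for a, x in zip(params, prev + cur))
        return 1.0 / (1.0 + math.exp(-z))
    return weight_fn


def make_random_generator(seed=0):
    """Hypothetical random weight generator: uniform weights in [0, 1)."""
    rng = random.Random(seed)

    def weight_fn(prev, cur):
        return rng.random()
    return weight_fn


# Toy stand-in for a sequence of three 2-dimensional LSTM outputs.
outputs = [[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]]

# All-zero gate parameters give a constant weight of sigmoid(0) = 0.5.
gated = temporal_residual(outputs, make_secondary_network([0.0, 0.0, 0.0, 0.0]))
noisy = temporal_residual(outputs, make_random_generator(seed=0))
```

Both generators share the same interface, so the shortcut logic stays identical and only the source of the weight changes. In a trained model the gate parameters would be learned jointly with the LSTM rather than fixed by hand as they are here.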