A Fast and Power Efficient Architecture to Parallelize LSTM based RNN for Cognitive Intelligence Applications.

Peng Ouyang, Shouyi Yin, Shaojun Wei
DOI: https://doi.org/10.1145/3061639.3062187
2017-01-01
Abstract: Long Short-Term Memory (LSTM) based Recurrent Neural Networks (RNNs) are promising for cognitive intelligence applications such as speech recognition, image captioning and natural language processing. However, the cascaded, dependent structure of RNNs, combined with a large number of power-inefficient operations such as multiplication, memory access and nonlinear transformation, cannot guarantee high computing speed and low power consumption. In this work, we exploit semantic correlation to propose a data pre-fetch method that breaks the dependency and enables parallel processing. Based on this method, we design a fully parallel, pipelined architecture that handles the large number of operations. Experiments on image captioning, speech recognition and language processing benchmarks show that this work improves computing speed by 5.1 times, 44.9 times and 1.53 times, respectively, and power efficiency by 1885.7 times, 4061.5 times and 127.5 times, respectively, compared with state-of-the-art works.
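
To make the dependency mentioned in the abstract concrete, the sketch below shows a textbook scalar LSTM recurrence in plain Python. It is only an illustration of why step t has to wait for the hidden state of step t-1; it is not the paper's pre-fetch method or architecture, which the abstract does not detail. The names `lstm_step`, `run_sequence` and the weight layout `w` are assumptions made for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x_t, h_prev, c_prev, w):
    """One scalar LSTM step (standard formulation, illustrative only).

    w is a hypothetical dict: gate name -> (input weight, recurrent weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        # Every gate reads h_prev: this is the step-to-step dependency.
        return w_x * x_t + w_h * h_prev + b

    i = sigmoid(pre("i"))    # input gate
    f = sigmoid(pre("f"))    # forget gate
    o = sigmoid(pre("o"))    # output gate
    g = math.tanh(pre("g"))  # candidate cell value
    c_t = f * c_prev + i * g
    h_t = o * math.tanh(c_t)
    return h_t, c_t

def run_sequence(xs, w):
    """Process inputs strictly in order; step t cannot start before step t-1 ends."""
    h, c = 0.0, 0.0
    hs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, w)
        hs.append(h)
    return hs
```

Each step contains several multiplications and nonlinear transformations (sigmoid, tanh), which are the operation types the abstract names as power-inefficient, and the loop in `run_sequence` cannot be unrolled in parallel as written because every gate depends on the previous hidden state.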