DSP Based Acceleration for Long Short-Term Memory Model Based Word Prediction Application

Keqian Zhu, Jingfei Jiang
DOI: https://doi.org/10.1109/icicta.2017.28
2017-01-01
Abstract: Neural network based deep learning is a research hotspot in artificial intelligence, and embedded artificial intelligence and mobile computing are becoming increasingly important in industry. These applications require high computing performance while staying within tight power budgets. DSPs have a specialized hardware architecture that combines high performance with low power consumption, which makes them an ideal computing platform for embedded artificial intelligence. This paper studies the energy efficiency of DSPs for deep learning applications. Much of the related research is limited in application scale and in the optimization techniques it explores. This work extends the application scale and describes several optimization methods in detail. Specifically, for a word prediction application based on the Long Short-Term Memory (LSTM) model, we use TI's high-performance multi-core DSP to accelerate the inference process. We apply a variety of optimization techniques to our initial DSP program, and the experimental results show that these techniques bring notable performance improvements. As baselines, we use a MATLAB program running on a general-purpose CPU and a C program running on an ARM processor. In terms of performance-to-power ratio, the DSP achieves 7.79 times that of the general-purpose CPU and 2.28 times that of the ARM processor, which indicates that the DSP is a suitable platform for embedded artificial intelligence.