Small Area Configurable Deep Neural Network Accelerator for IoT System

Liangkai Zhao, Ning Wu, Fen Ge, Fang Zhou, Jiahui Zhang, Tong Lu
DOI: https://doi.org/10.1109/icct50939.2020.9295774
2020-01-01
Abstract: With the development of Internet of Things (IoT) technology, deep learning algorithms have been widely deployed in IoT devices. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) play significant roles in image processing and sequence-data processing, respectively. To enable an IoT terminal SoC with limited computing power and resources to support CNN and RNN algorithms, this paper proposes a small-area configurable deep neural network accelerator that uses fixed-point arithmetic. The main computing components of CNNs and RNNs are implemented in hardware, and the computing modules are configured through parameters and combined to carry out the calculations of complex neural networks. To verify the performance of the accelerator, an SoC verification system based on the Cortex-M3 was constructed. The LeNet-5 network and a Long Short-Term Memory (LSTM) network with 2 layers and 128 hidden units were implemented on the accelerator, and their execution times were compared with those on Intel i5 7500, Cortex-A53 and Cortex-A7 processors. The comparison shows that, at a clock frequency of 50 MHz, the accelerator's CNN and RNN computing performance exceeds that of the Cortex-A53 and Cortex-A7.
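To illustrate what fixed-point computation means for the multiply-accumulate operations at the core of CNN and RNN layers, here is a minimal Python sketch. The abstract does not state the accelerator's word length or number format, so the 16-bit Q8.8 format, the saturation behavior, and the function names (`to_fixed`, `to_float`, `fixed_dot`) are assumptions made purely for illustration, not the authors' design.

```python
# Hypothetical number format: 16-bit signed word with 8 fractional bits (Q8.8).
# The paper's actual word length is not given in the abstract.
FRAC_BITS = 8
WORD_BITS = 16
Q_MIN = -(1 << (WORD_BITS - 1))
Q_MAX = (1 << (WORD_BITS - 1)) - 1


def saturate(value: int) -> int:
    """Clamp an integer to the representable signed word range."""
    return max(Q_MIN, min(Q_MAX, value))


def to_fixed(x: float) -> int:
    """Quantize a float to a Q8.8 integer, saturating on overflow."""
    return saturate(round(x * (1 << FRAC_BITS)))


def to_float(q: int) -> float:
    """Convert a Q8.8 integer back to a float."""
    return q / (1 << FRAC_BITS)


def fixed_dot(weights, inputs) -> int:
    """Dot product of two Q8.8 vectors.

    Products are summed in a wide accumulator (each product carries
    2 * FRAC_BITS fractional bits), then rescaled once and saturated.
    """
    acc = 0
    for w, x in zip(weights, inputs):
        acc += w * x
    return saturate(acc >> FRAC_BITS)


if __name__ == "__main__":
    w = [to_fixed(0.5), to_fixed(-1.0)]
    x = [to_fixed(2.0), to_fixed(0.25)]
    print(to_float(fixed_dot(w, x)))  # 0.5*2.0 + (-1.0)*0.25 = 0.75
```

Accumulating in a wider register and rescaling only once at the end is a common way to limit rounding error in fixed-point datapaths; whether the accelerator described here does the same is not stated in the abstract.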