iFPNA: A Flexible and Efficient Deep Neural Network Accelerator with a Programmable Data Flow Engine in 28nm CMOS

Chixiao Chen, Xindi Liu, Huwan Peng, Hongwei Ding, C.-J. Richard Shi
DOI: https://doi.org/10.1109/esscirc.2018.8494327
2018-01-01
Abstract: This paper presents iFPNA, an instruction-and-fabric programmable neuron array: a general-purpose deep learning accelerator that achieves both energy efficiency and flexibility. The iFPNA has a programmable data flow engine with a custom instruction set and 16 configurable neuron slices for parallel neuron operations at different bit-widths. Convolutional neural networks with different kernel sizes are implemented by choosing among data flows such as input stationary, row stationary, and tunnel stationary. Recurrent neural networks with element-wise operations are implemented by a universal activation engine. Measurement results show that the iFPNA achieves a peak energy efficiency of 1.72 TOPS/W at a 30 MHz clock rate and a 0.63 V supply voltage. At a 125 MHz clock rate, the measured latency is 60.8 ms on AlexNet and 40 ms on LSTM-512.
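To put the reported figures in more familiar units, the short sketch below converts the measured latencies into inferences per second and the peak efficiency into energy per operation. It is only unit arithmetic on the numbers quoted in the abstract; the function and variable names are illustrative and do not come from the paper. Note that the peak efficiency was measured at 30 MHz and 0.63 V, while the latencies were measured at 125 MHz, so the two sets of numbers describe different operating points and should not be multiplied together.

```python
def inferences_per_second(latency_ms: float) -> float:
    """Convert a single-inference latency in milliseconds to a rate per second."""
    return 1000.0 / latency_ms


def picojoules_per_op(tops_per_watt: float) -> float:
    """Convert TOPS/W to energy per operation in picojoules.

    1 TOPS/W is 1e12 operations per joule, so one operation costs
    1 / (x * 1e12) joules, which equals 1 / x picojoules.
    """
    return 1.0 / tops_per_watt


# Figures quoted in the abstract.
alexnet_rate = inferences_per_second(60.8)  # AlexNet latency at 125 MHz
lstm_rate = inferences_per_second(40.0)     # LSTM-512 latency at 125 MHz
peak_energy = picojoules_per_op(1.72)       # peak efficiency at 30 MHz, 0.63 V

print(f"AlexNet:  {alexnet_rate:.1f} inferences/s")
print(f"LSTM-512: {lstm_rate:.1f} inferences/s")
print(f"Peak:     {peak_energy:.2f} pJ per operation")
```

Under these conversions, the AlexNet latency corresponds to roughly 16 inferences per second and the peak efficiency to a little under 0.6 pJ per operation.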