A 3D Multi-Layer CMOS-RRAM Accelerator for Neural Network

Hantao Huang, Leibin Ni, Yuhao Wang, Hao Yu, Zongwei Wang, Yimao Cai, Ru Huang
DOI: https://doi.org/10.1109/3dic.2016.7970014
2016-01-01
Abstract: Incremental machine learning is required for future real-time data analytics. This paper introduces a 3D multi-layer CMOS-RRAM accelerator for incremental least-squares based learning on neural networks. With the input data buffered on an RRAM memory layer, the intensive matrix-vector multiplication is first accelerated on a digitized RRAM-crossbar layer. The remaining incremental least-squares operations for feature extraction and classifier training are accelerated on a CMOS ASIC layer, using an incremental Cholesky factorization accelerator designed with parallelism and pipelining in mind. Experimental results show that the 3D accelerator significantly reduces training time with acceptable accuracy. Compared to a 3D-CMOS-ASIC implementation, it achieves 1.28x smaller area, 2.05x faster runtime and 12.4x energy reduction. Compared to a GPU implementation, it shows a 3.07x speed-up and 162.86x energy saving.
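The abstract names incremental Cholesky factorization as the core of the least-squares solver but does not give the algorithm. The following is a minimal software sketch of the general technique, not the paper's hardware design. It assumes the least-squares problem min ||Hw - t|| is solved through the normal equations (H^T H) w = H^T t, and that the model grows one feature column of H at a time, so the Cholesky factor is extended by one row per step instead of being recomputed. All function names (`cholesky_append`, `incremental_least_squares`, and the helpers) are hypothetical and chosen for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def forward_substitute(L, b):
    """Solve L y = b for a lower-triangular matrix L (list of rows)."""
    y = []
    for i, row in enumerate(L):
        s = sum(row[j] * y[j] for j in range(i))
        y.append((b[i] - s) / row[i])
    return y


def back_substitute_transpose(L, y):
    """Solve L^T x = y for a lower-triangular matrix L."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[j][i] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / L[i][i]
    return x


def cholesky_append(L, b, c):
    """Extend the Cholesky factor L of A to that of [[A, b], [b^T, c]].

    Only one triangular solve and one square root are needed; the
    existing rows of L are reused unchanged.
    """
    l = forward_substitute(L, b)          # new off-diagonal row
    d2 = c - dot(l, l)                    # squared new diagonal entry
    if d2 <= 0.0:
        raise ValueError("extended matrix is not positive definite")
    return [row + [0.0] for row in L] + [l + [math.sqrt(d2)]]


def incremental_least_squares(columns, t):
    """Solve min ||H w - t||, adding the columns of H one at a time."""
    L, rhs, used = [], [], []
    for col in columns:
        b = [dot(col, u) for u in used]   # new column of H^T H
        L = cholesky_append(L, b, dot(col, col))
        used.append(col)
        rhs.append(dot(col, t))           # new entry of H^T t
    y = forward_substitute(L, rhs)        # L y = H^T t
    return back_substitute_transpose(L, y)  # L^T w = y


# Fit t = w0 + w1 * x on the points x = 0, 1, 2.
weights = incremental_least_squares([[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]],
                                    [1.0, 3.0, 5.0])
print(weights)
```

Each call to `cholesky_append` costs one forward substitution on the existing factor, which is the property that makes the incremental form attractive when features are added step by step. How this maps onto the parallel, pipelined CMOS layer is specific to the paper and is not modeled here.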