D-NAT: Data-Driven Non-Ideality Aware Training Framework for Fabricated Computing-In-Memory Macros

Ming-Guang Lin, Chi-Tse Huang, Yu-Chuan Chuang, Yi-Ta Chen, Ying-Tuan Hsu, Yu-Kai Chen, Jyun-Jhe Chou, Tsung-Te Liu, Chi-Sheng Shih, An-Yeu Wu
DOI: https://doi.org/10.1109/jetcas.2022.3171268
IF: 5.877
2022-01-01
IEEE Journal on Emerging and Selected Topics in Circuits and Systems
Abstract: To enable energy-efficient computation for deep neural networks (DNNs) at the edge, computing-in-memory (CIM) has been proposed to reduce the energy cost of intensive off-chip memory access. However, CIM is prone to multiply-accumulate (MAC) errors caused by non-idealities of the memory crossbars and peripheral circuits, which severely degrade DNN accuracy. In this work, we propose a Data-Driven Non-ideality Aware Training (D-NAT) framework to compensate for the accuracy degradation. The proposed D-NAT framework makes the following contributions: 1) We measure a fabricated SRAM-based CIM macro to obtain a data-driven MAC error model (D-MAC-EM). Based on the derived D-MAC-EM, we analyze the impact of the non-idealities on DNN accuracy. 2) To make DNNs robust to the non-idealities of CIM macros, we incorporate the measured D-MAC-EM into the DNN training procedure. Moreover, we propose a statistical training mechanism to better estimate the gradients of the discrete D-MAC-EM. 3) We investigate the trade-off between quantization range and quantization error. To mitigate quantization errors in activations, we introduce extended PACT (E-PACT), which adaptively learns the upper and lower bounds of the input activations for each layer. Simulation results show that the proposed D-NAT improves the accuracy of ResNet20, VGG8, ResNet34, and VGG16 by 78.98%, 71.8%, 72.04%, and 57.85%, respectively, which reaches the ideal upper bound of the quantized model. Lastly, the D-NAT framework is validated on an FPGA platform with the fabricated SRAM-based CIM macro chip. Based on the measurement results, D-NAT successfully recovers the accuracy under the non-idealities of a real SRAM-based CIM macro.
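The trade-off between quantization range and quantization error mentioned in contribution 3 can be illustrated with a minimal sketch. This is not the paper's E-PACT, which learns the bounds during training; it is a plain uniform quantizer with fixed lower and upper bounds. The function names, the 4-bit setting, and the sample activations are all hypothetical and chosen only for illustration.

```python
def quantize_activation(x, lower, upper, bits=4):
    """Clip x to [lower, upper], then snap it to one of 2**bits uniform levels.

    `lower` and `upper` are fixed here; E-PACT, as described in the abstract,
    would instead learn them per layer (that learning step is not shown).
    """
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    steps = 2 ** bits - 1                    # number of intervals between levels
    step = (upper - lower) / steps           # width of one quantization interval
    clipped = min(max(x, lower), upper)      # values outside the range saturate
    return lower + round((clipped - lower) / step) * step


def mean_squared_error(values, lower, upper, bits=4):
    """Average squared difference between each value and its quantized version."""
    return sum(
        (v - quantize_activation(v, lower, upper, bits)) ** 2 for v in values
    ) / len(values)


# Hypothetical activations: a cluster of small values and one large outlier.
small_values = [-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 0.7]
outlier = 6.0

# A wide range represents the outlier but uses coarse steps for small values.
wide_small_error = mean_squared_error(small_values, lower=-0.4, upper=6.0)
wide_outlier_error = abs(outlier - quantize_activation(outlier, -0.4, 6.0))

# A narrow range uses fine steps for small values but clips the outlier.
narrow_small_error = mean_squared_error(small_values, lower=-0.4, upper=1.0)
narrow_outlier_error = abs(outlier - quantize_activation(outlier, -0.4, 1.0))
```

With these sample values, the narrow range gives a lower error on the small activations, while the wide range gives a lower error on the outlier. Learning both bounds per layer is one way to balance these two error sources instead of fixing them by hand.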
engineering, electrical & electronic