A Multiply-Less Approximate SRAM Compute-In-Memory Macro for Neural-Network Inference

Haikang Diao, Yifan He, Xuan Li, Chen Tang, Wenbin Jia, Jinshan Yue, Haoyang Luo, Jiahao Song, Xueqing Li, Huazhong Yang, Hongyang Jia, Yongpan Liu, Yuan Wang, Xiyuan Tang
DOI: https://doi.org/10.1109/jssc.2024.3433417
IF: 5.4
2024-01-01
IEEE Journal of Solid-State Circuits
Abstract: Compute-in-memory (CIM) is promising for reducing data-movement energy and providing high bandwidth for matrix-vector multiplies (MVMs). However, existing work still faces challenges, such as the digital logic overhead caused by multiply-add operations (OPs) and structural sparsity. This article presents a 2-to-8-b scalable approximate digital SRAM-based CIM macro co-designed with a multiply-less neural network (NN) approach. It incorporates dynamic-logic-based approximate circuits that save logic area and energy by eliminating multiplications. A prototype fabricated in 28-nm CMOS technology achieves a peak multiply-accumulate (MAC)-level energy efficiency of 102 TOPS/W for 8-b operations. An NN model deployment flow is used to demonstrate CIFAR-10 and ImageNet classification with ResNet-20 and ResNet-50 style multiply-less models, respectively, achieving accuracies of 91.74% and 74.8% with 8-b weights and activations.