MixCIM: A Hybrid-Cell-Based Computing-in-Memory Macro with Less-Data-Movement and Activation-Memory-Reuse for Depthwise Separable Neural Networks

Xin Qiao, Jiahao Song, Youming Yang, Renjie Wei, Xiyuan Tang, Meng Li, Runsheng Wang, Yuan Wang
DOI: https://doi.org/10.1109/cicc60959.2024.10529086
2024-01-01
Abstract: Depthwise separable neural network models with fewer parameters, such as MobileNet, are better suited to edge AI devices. They replace the standard convolution with the depthwise separable convolution, which consists of a depthwise (DW) convolution and a pointwise (PW) convolution. Most prior computing-in-memory (CIM) works [1–5] optimize multiply-and-accumulate (MAC) operations for only one of these two types. Thus, when performing depthwise separable convolution, recent SRAM-based CIMs still face limitations in energy efficiency, throughput, and memory utilization (Fig. 1): (1) low memory energy utilization due to repeated activation accesses from the cache outside the CIM macro, which account for more than 30% of energy consumption; (2) poor array temporal utilization caused by short-length MACs and the convolution window sliding over activations, which increases inference execution cycles; (3) insufficient memory spatial utilization due to duplicated memory for input and output activations in DW convolution, causing double the memory overhead.
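As a rough illustration of why depthwise separable layers have fewer parameters, and why the DW stage yields short MACs, the following minimal sketch counts weights and MAC operations for a standard convolution and for its DW plus PW replacement. The layer shape and function names are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch assumes stride 1, "same" padding, and no bias terms.

```python
def standard_conv_cost(h, w, c_in, c_out, k):
    """Weights and MACs for a standard k x k convolution on an h x w output map."""
    params = k * k * c_in * c_out     # every output channel sees every input channel
    macs = h * w * params             # each output pixel accumulates k*k*c_in products per channel
    return params, macs


def depthwise_separable_cost(h, w, c_in, c_out, k):
    """Weights and MACs for a DW (k x k per channel) followed by a PW (1 x 1) convolution."""
    dw_params = k * k * c_in          # one k x k filter per input channel
    pw_params = c_in * c_out          # 1 x 1 filters that mix channels
    dw_macs = h * w * dw_params       # each DW output accumulates only k*k products
    pw_macs = h * w * pw_params       # each PW output accumulates c_in products
    return dw_params + pw_params, dw_macs + pw_macs


# Hypothetical layer shape (assumption, not from the paper).
H, W, C_IN, C_OUT, K = 14, 14, 256, 256, 3

std_params, std_macs = standard_conv_cost(H, W, C_IN, C_OUT, K)
sep_params, sep_macs = depthwise_separable_cost(H, W, C_IN, C_OUT, K)

# The reduction ratio equals 1/c_out + 1/k^2 for both weights and MACs.
param_ratio = sep_params / std_params
mac_ratio = sep_macs / std_macs

# Accumulation length per output: k*k for DW versus c_in for PW.
dw_mac_length = K * K
pw_mac_length = C_IN
```

For this assumed shape, each DW output accumulates only 9 products while each PW output accumulates 256. That mismatch is one plausible reading of the "short-length MAC" limitation described above, since an array sized for long PW dot products would be mostly idle during DW.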