S2D-CIM: A 22nm 128kb Systolic Digital Compute-in-Memory Macro with Domino Data Path for Flexible Vector Operation and 2-D Weight Update in Edge AI Applications

Meng Wu, Wenjie Ren, Peiyu Chen, Wentao Zhao, Yiqi Jing, Jiayoon Ru, Zhixuan Wang, Yufei Ma, Ru Huang, Tianyu Jia, Le Ye
DOI: https://doi.org/10.1109/cicc60959.2024.10529046
2024-01-01
Abstract: Digital compute-in-memory (DCIM) performs high-precision multiply-accumulate operations (MACs), e.g., floating-point, more robustly and efficiently than analog CIM solutions. Prior DCIMs [1–4] normally adopt a dataflow with broadcast inputs and stationary weights, which reaches peak energy efficiency (EE) and area efficiency (AE) only at full utilization. However, in practical applications NN model sizes often do not match the fixed CIM macro size, leading to unavoidable degradation of utilization and efficiency. As shown in Fig. 1, the low-utilization issue, e.g., 49.8% for YOLO-v7, becomes more pronounced for lightweight edge AI models, e.g., 17.6% for EfficientNet-lite4. Low utilization also wastes energy in the unused macro circuits. Furthermore, the large volume of weight updates during DCIM weight preloading or reloading can degrade system-level EE and AE, an effect that is often overlooked. Although [3–4] hide weight updates behind simultaneous MACs, growing model sizes require more frequent weight reloading, leading to non-static-weight compute in DCIM.
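To make the utilization argument concrete, the following is a minimal sketch of how a size mismatch between a layer's weight matrix and a fixed macro could translate into idle cells. The function name `macro_utilization`, the tiling model (a weight matrix padded up to whole macros), and all layer and macro dimensions are assumptions chosen for illustration; they are not taken from the paper, and the 49.8% and 17.6% figures quoted above come from the authors' own analysis, not from this formula.

```python
import math


def macro_utilization(rows: int, cols: int, macro_rows: int, macro_cols: int) -> float:
    """Fraction of allocated macro cells that hold real weights.

    Assumes (hypothetically) that a rows x cols weight matrix is tiled onto
    fixed macro_rows x macro_cols macros, with partially filled macros
    still fully allocated.
    """
    tiles = math.ceil(rows / macro_rows) * math.ceil(cols / macro_cols)
    return (rows * cols) / (tiles * macro_rows * macro_cols)


# Hypothetical shapes: one layer that matches the macro exactly,
# one that slightly overflows it, and one much smaller lightweight layer.
full = macro_utilization(256, 256, 256, 256)
overflow = macro_utilization(300, 256, 256, 256)
small = macro_utilization(96, 40, 256, 256)

print(f"matched layer:   {full:.1%}")
print(f"overflow layer:  {overflow:.1%}")
print(f"small layer:     {small:.1%}")
```

Under this simplified model, only the exactly matched layer reaches full utilization; both the slightly oversized layer and the small layer leave a large share of the allocated cells unused, which is the effect the abstract attributes to fixed macro sizes.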