A Spatial-Designed Computing-In-Memory Architecture Based on Monolithic 3D Integration for High-Performance Systems.

Jiaming Li, Bin Gao, Ruihua Yu, Peng Yao, Jianshi Tang, He Qian, Huaqiang Wu
DOI: https://doi.org/10.1145/3611315.3633240
2023-01-01
Abstract: Computing-in-memory (CIM) technology effectively addresses the data-movement bottleneck of the traditional von Neumann architecture, especially for deep neural network (DNN) acceleration. However, as the performance and parallelism of CIM processing elements (PEs) improve, the substantial latency and power overhead of transmitting high-density intermediate results has become a new bottleneck in CIM architectures. In this paper, we propose a spatial-designed CIM architecture based on the emerging Monolithic 3D (M3D) technology, together with a spatiality-aware DNN mapping method for high-performance CIM systems. The proposed architecture introduces a novel hierarchy built from staggered tiers, which allows PEs to be shared by multiple tiles, and uses the ultra-dense and lower-power Inter-Layer Vias (ILVs) as shared buses, which lets CIM PEs exploit the ultra-high bandwidth of M3D for inter-tile and intra-tile data transfer. Experimental results show that the proposed M3D-enabled CIM architecture, combined with the proposed mapping method, achieves a 6.52× latency improvement, a 40.84× interconnection energy-delay product (EDP) improvement, and a 7.62× system-level EDP improvement compared with a state-of-the-art CIM architecture.
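
For readers unfamiliar with the EDP metric, the following minimal Python sketch shows how the reported ratios relate to each other. It assumes the standard definition EDP = energy × delay, and it assumes that the delay term in the system-level EDP is the same latency as the reported 6.52× figure; the abstract does not state this, so the implied energy ratio is illustrative only. The function and constant names are hypothetical and do not come from the paper.

```python
def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product under the standard definition (lower is better)."""
    return energy_joules * delay_seconds


def implied_energy_gain(edp_gain: float, latency_gain: float) -> float:
    """Energy improvement implied by an EDP gain and a latency gain.

    Since EDP = E * D, the improvement ratios multiply:
    edp_gain = energy_gain * latency_gain.
    """
    return edp_gain / latency_gain


# Ratios reported in the abstract.
LATENCY_GAIN = 6.52
SYSTEM_EDP_GAIN = 7.62

# Assumption: the system-level EDP uses the same latency as LATENCY_GAIN.
system_energy_gain = implied_energy_gain(SYSTEM_EDP_GAIN, LATENCY_GAIN)
print(f"implied system-level energy gain: {system_energy_gain:.2f}x")
```

Under that assumption, most of the system-level EDP gain would come from the latency reduction rather than from energy savings. The same calculation is not applied to the 40.84× interconnection EDP figure, because its delay term presumably refers to interconnect delay, which the abstract does not report separately.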