EMS: Efficient Memory Subsystem Synthesis for Spatial Accelerators

Liancheng Jia,Yuyue Wang,Jingwen Leng,Yun Liang
DOI: https://doi.org/10.1145/3489517.3530411
2022-01-01
Abstract:Spatial accelerators provide massive parallelism with an array of homogeneous PEs, and enable efficient data reuse with PE array dataflow and on-chip memory. Many previous works have studied the dataflow architecture of spatial accelerators, including performance analysis and automatic generation. However, existing accelerator generators fail to exploit the entire memory-level reuse opportunities, and generate suboptimal designs with data duplication and inefficient interconnection. In this paper, we propose EMS, an efficient memory subsystem synthesis and optimization framework for spatial accelerators. We first use space-time transformation (STT) to analyze both PE-level and memory-level data reuse. Based on the reuse analysis, we develop an algorithm to automatically generate data layout of the multi-banked scratchpad memory, data mapping, and access controller for the memory. Our generated memory subsystem supports multiple PE-memory interconnection topologies including direct, multicast, and rotated connection. The memory and interconnection generation approach can efficiently utilize the memory-level reuse to avoid duplicated data storage with low hardware cost. EMS can automatically synthesize tensor algebra to hardware designed in Chisel. Experiments show that our proposed memory generator reduces the on-chip memory size by an average of 28% than the state-of-the-art, and achieves comparable hardware performance.
What problem does this paper attempt to address?