Mixed-granularity Parallel Coarse-Grained Reconfigurable Architecture

Jinyi Deng, Linyun Zhang, Lei Wang, Jiawei Liu, Kexiang Deng, Shibin Tang, Jiangyuan Gu, Boxiao Han, Fei Xu, Leibo Liu, Shaojun Wei, Shouyi Yin
DOI: https://doi.org/10.1145/3489517.3530454
2022-01-01
Abstract: Coarse-Grained Reconfigurable Architecture (CGRA) is a high-performance computing architecture. However, existing CGRAs have low silicon utilization because they lack both fine-grained parallelism inside the Processing Element (PE) and a general coarse-grained parallel approach on the PE array. The absence of fine-grained parallelism in the PE not only leads to low silicon utilization of the PE, but also makes the mapping loose and irregular. The absence of a generalized parallel method for the mapping causes low PE utilization on the CGRA. Our goal is to design an execution model and a Mixed-granularity Parallel CGRA (MP-CGRA) that can parallelize operator execution in PEs at fine granularity and parallelize data transmission in channels, leading to a compact mapping. A coarse-grained general parallel method is proposed to vectorize the compact mapping. Evaluated with MachSuite, MP-CGRA achieves a 104.65% improvement in silicon utilization on the PE array and a 91.40% improvement in performance per area compared with a baseline CGRA.