FDRA: A Framework for a Dynamically Reconfigurable Accelerator Supporting Multi-Level Parallelism
Yunhui Qiu,Yiqing Mao,Xuchen Gao,Sichao Chen,Jiangnan Li,Wenbo Yin,Lingli Wang
DOI: https://doi.org/10.1145/3614224
IF: 2.837
2024-01-01
ACM Transactions on Reconfigurable Technology and Systems
Abstract:Coarse-grained reconfigurable architectures (CGRAs) have emerged as promising accelerators due to their high flexibility and energy efficiency. However, existing open source works often lack integration of CGRAs with CPU systems and corresponding toolchains. Moreover, there is rare support for the accelerator instruction pipelining to overlap data communication, computation, and configuration across multiple tasks. In this article, we propose FDRA, an open source exploration framework for a heterogeneous system-on-chip (SoC) with a RISC-V processor and a dynamically reconfigurable accelerator (DRA) supporting loop, instruction, and task levels of parallelism. FDRA encompasses parameterized SoC modeling, Verilog generation, source-to-source application code transformation using frontend and DRA compilers, SoC simulation, and FPGA prototyping. FDRA incorporates the extraction of periodic accumulative operators and multi-dimensional linear load/store operators from nested loops. The DRA enables accessing the shared L2 cache with virtual addresses and supports direct memory access with arbitrary start addresses and data lengths. Integrated into the RISC-V Rocket SoC, our DRA achieves a remarkable 55× acceleration for loop kernels and improves energy efficiency by 29×. Compared to state-of-the-art RISC-V vector units, our DRA demonstrates a 2.9× speed improvement and 3.5× greater energy efficiency. In contrast to previous CGRA+RISC-V SoCs, our SoC achieves a minimum speedup of 5.2×.