Memory-Aware Loop Mapping on Coarse-Grained Reconfigurable Architectures
Shouyi Yin, Xianqing Yao, Tianyi Lu, Dajiang Liu, Jiangyuan Gu, Leibo Liu, Shaojun Wei
DOI: https://doi.org/10.1109/tvlsi.2015.2474129
2015-01-01
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Abstract: Coarse-Grained Reconfigurable Architecture (CGRA) is a promising architecture that offers high performance, high power efficiency, and attractive flexibility. The compute-intensive parts of an application (e.g., loops) are often mapped onto a CGRA for acceleration. Because the processing elements (PEs) demand highly parallel data access and a single-bank multi-port memory is extremely expensive, architectures with multi-bank memory are increasingly favored. For such architectures, a joint solution that simultaneously considers modulo scheduling and data placement is proposed to achieve a valid mapping with better performance. Experimental results on loops from Livermore, Polybench, and Mediabench show that our approach can significantly improve kernel performance on CGRA compared with REGIMap, HTDM, and REGIMap+MP, with an acceptable increase in compilation time.
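To illustrate why modulo scheduling and data placement interact on a multi-bank memory, the following minimal Python sketch counts bank conflicts for a given schedule and placement. It is not the algorithm proposed in the paper. It assumes single-port banks (one access per bank per cycle), and the function name, operation names, arrays, and initiation interval (II) are all hypothetical.

```python
from collections import defaultdict

def bank_conflicts(schedule, accesses, placement, ii):
    """Count accesses beyond one per (modulo slot, bank).

    Assumption: each bank is single-ported, so two accesses that fall
    in the same slot (time mod II) and the same bank conflict.
    """
    load = defaultdict(int)
    for op, t in schedule.items():
        array = accesses.get(op)
        if array is None:
            continue  # operation does not touch memory
        load[(t % ii, placement[array])] += 1
    return sum(n - 1 for n in load.values() if n > 1)

# Hypothetical loop body: two loads, one add, one store, with II = 2.
schedule = {"ld_a": 0, "ld_b": 2, "add": 3, "st_c": 5}
accesses = {"ld_a": "A", "ld_b": "B", "st_c": "C"}

naive = {"A": 0, "B": 0, "C": 0}  # every array in bank 0
aware = {"A": 0, "B": 1, "C": 0}  # A and B split across two banks

print(bank_conflicts(schedule, accesses, naive, ii=2))
print(bank_conflicts(schedule, accesses, aware, ii=2))
```

In this toy case both loads land in modulo slot 0, so placing arrays A and B in the same bank creates a conflict that a different placement (or a different schedule) avoids. This is the kind of coupling that motivates considering the two problems together.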