Data parallelism optimization for the CGRA loop pipelining mapping

YANG Zi-Yu, Ming Yan, WANG Da-Wei, LI Si-Kun
DOI: https://doi.org/10.3724/SP.J.1016.2013.01280
2013-01-01
Jisuanji Xuebao/Chinese Journal of Computers
Abstract: Loop kernels of data-intensive applications account for a large share of a program's total execution time. However, mapping loop kernels efficiently onto a CGRA (Coarse-Grained Reconfigurable Architecture) remains a hard problem. To maximize application parallelism and hardware resource utilization while minimizing memory cost, a data parallelism method, MDP, is proposed for CGRA loop pipelining. With a novel reconfigurable computing model named TMGC2, loops can exploit both pipelining within CGRA pipelines and parallelism between pipelines. A memory bank conflict algorithm is proposed to reorganize the data. Using a data reuse graph, MDP can significantly improve performance through data parallelism. Experimental results show that, with the proposed approach, LKPM achieves 41.3% higher resource utilization and 37.2% higher throughput than the previous method without optimization.