CDPM: Context-Directed Pattern Matching Prefetching to Improve Coarse-Grained Reconfigurable Array Performance.

Leibo Liu, Chen Yang, Shouyi Yin, Shaojun Wei
DOI: https://doi.org/10.1109/tcad.2017.2748026
IF: 2.9
2017-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: Coarse-grained reconfigurable arrays (CGRAs) can be dynamically programmed by configuration contexts to run multiple operations concurrently on an array of processing elements. This further widens the gap between the demand for off-chip memory bandwidth and the limited speed of off-chip memory access. Cache prefetching is widely used to mitigate off-chip memory latency. However, straightforwardly applying existing prefetching techniques (which primarily target instruction-driven processors) to CGRAs may induce inaccurate prefetching, thereby crippling CGRA performance. Exploiting the repeated execution of contexts in CGRA computing, this paper proposes a context-directed pattern matching (CDPM) mechanism to improve prefetching accuracy for CGRAs. CDPM generates a prefetch pattern when a context is first executed, and then reuses the pattern to issue prefetch requests when the context is re-executed. To eliminate outdated prefetch patterns, CDPM also evaluates the prefetching accuracy of each pattern at run time by adding prefetch addresses to a Bloom filter. The distinguishing feature of CDPM is its use of the CGRA configuration context as a guide to improving prefetching accuracy. Experimental results showed that CDPM prefetching improved performance by 31.1% on average compared to tests without prefetching and by 7.7% on average compared to state-of-the-art cache prefetching techniques, while incurring only slight area and power overheads.
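The mechanism summarized in the abstract (record a pattern on first execution, replay it on re-execution, and use a Bloom filter to judge whether the pattern is still accurate) can be illustrated with a rough software sketch. This is not the paper's hardware design: the class names `BloomFilter` and `ContextPrefetcher`, the pattern format (a list of addresses per context), the filter size, and the 0.5 accuracy threshold are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; size and hash count are illustrative assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit vector stored in a Python int

    def _positions(self, addr):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{addr}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, addr):
        for pos in self._positions(addr):
            self.bits |= 1 << pos

    def __contains__(self, addr):
        return all((self.bits >> pos) & 1 for pos in self._positions(addr))


class ContextPrefetcher:
    """Hypothetical model: one prefetch pattern per configuration context."""

    def __init__(self, accuracy_threshold=0.5):  # threshold is an assumption
        self.patterns = {}  # context id -> ordered list of addresses
        self.accuracy_threshold = accuracy_threshold

    def run_context(self, context_id, demand_addrs):
        """Model one execution of a context; return the prefetches issued."""
        unique_demand = list(dict.fromkeys(demand_addrs))
        pattern = self.patterns.get(context_id)
        if pattern is None:
            # First execution: nothing to replay, so just record the pattern.
            self.patterns[context_id] = unique_demand
            return []

        # Re-execution: replay the stored pattern as prefetch requests and
        # remember each prefetched address in a Bloom filter.
        issued = BloomFilter()
        for addr in pattern:
            issued.add(addr)

        # Accuracy estimate: share of prefetches that were actually demanded.
        # Bloom filter false positives can only overestimate this value.
        hits = sum(1 for addr in unique_demand if addr in issued)
        accuracy = min(hits, len(pattern)) / len(pattern) if pattern else 0.0

        if accuracy < self.accuracy_threshold:
            # Pattern looks outdated: replace it with the observed accesses.
            self.patterns[context_id] = unique_demand
        return list(pattern)
```

In this sketch, calling `run_context` twice with the same context ID and the same addresses returns an empty list the first time and the recorded addresses the second time. If the context later touches entirely different addresses, the stale pattern is still replayed once, scores a low accuracy, and is then replaced.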