RACB: Resource Aware Cache Bypass on GPUs

Hongwen Dai, Christos Kartsaklis, Chao Li, Tomislav Janjusic, Huiyang Zhou
DOI: https://doi.org/10.1109/SBAC-PADW.2014.14
2014-01-01
Abstract: Caches are universally used in computing systems to hide long off-chip memory access latencies. Unlike on CPUs, the massive number of threads running simultaneously on GPUs puts tremendous pressure on the memory hierarchy. As a result, limited cache resources become a bottleneck that prevents a GPU from exploiting thread-level parallelism (TLP) and memory-level parallelism (MLP) to achieve high performance. In this paper, we propose a mechanism that bypasses the L1D and L2 caches based on the availability of cache resources. The mechanism is motivated by the observation that a large number of stalls caused by limited cache resources prevents GPUs from delivering higher throughput. We therefore propose Resource Aware Cache Bypass (RACB), which eliminates such stalls with minor hardware changes. We examine the effectiveness of this approach when applied to the L1D and L2 caches separately as well as together. Evaluation results with the NVIDIA Computing SDK show that RACB generally improves performance the most when applied to both the L1D and L2 caches: by up to 88.05%, and by 16.73% on average. In addition, it saves up to 22.35% of energy, and 5.88% on average, with minor hardware overhead.
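The core idea in the abstract (send a request around the cache when the cache has no free resources, instead of stalling) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's design: the abstract does not name the limiting resources, so the bounded pool of miss-tracking entries below, along with `ToyCache`, `count_outcomes`, and `miss_entries`, are hypothetical names chosen for illustration. The model also assumes a burst of requests arrives before any fill returns, and it only counts outcomes rather than modelling timing.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Toy cache: resident lines plus a bounded pool of miss-tracking entries.

    The bounded pool is an assumed stand-in for "cache resources";
    it is not taken from the paper.
    """
    miss_entries: int                              # hypothetical resource limit
    resident: set = field(default_factory=set)     # lines already cached
    in_flight: set = field(default_factory=set)    # lines with an outstanding miss

    def access(self, line: int, bypass_when_full: bool) -> str:
        if line in self.resident:
            return "hit"
        if line in self.in_flight:
            return "merged"                        # joins an outstanding miss
        if len(self.in_flight) < self.miss_entries:
            self.in_flight.add(line)               # a free entry tracks the miss
            return "miss"
        # No free entry: the baseline stalls, resource-aware bypass
        # forwards the request to the next memory level instead.
        return "bypass" if bypass_when_full else "stall"


def count_outcomes(requests, miss_entries, bypass_when_full):
    """Count access outcomes for a burst of cache-line requests."""
    cache = ToyCache(miss_entries=miss_entries)
    return Counter(cache.access(line, bypass_when_full) for line in requests)


if __name__ == "__main__":
    burst = [0, 1, 2, 3, 0, 4]
    print("baseline:", dict(count_outcomes(burst, 2, bypass_when_full=False)))
    print("bypass:  ", dict(count_outcomes(burst, 2, bypass_when_full=True)))
```

In this sketch, every request that would have stalled in the baseline is counted as a bypass instead. Whether that translates into higher throughput on real hardware depends on factors the toy model ignores, such as next-level bandwidth and lost reuse in the bypassed cache.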