Optimizing Cache Bypassing and Warp Scheduling for GPUs

Yun Liang, Xiaolong Xie, Yu Wang, Guangyu Sun, Tao Wang
DOI: https://doi.org/10.1109/TCAD.2017.2764886
IF: 2.9
2018-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: The massive parallel architecture enables graphics processing units (GPUs) to boost performance for a wide range of applications. Initially, GPUs only employ scratchpad memory as on-chip memory. Recently, to broaden the scope of applications that can be accelerated by GPUs, GPU vendors have used caches as on-chip memory in the new generations of GPUs. Unfortunately, GPU caches face many performanc...