Quality of Service Support for Fine-Grained Sharing on GPUs.

Zhenning Wang, Jun Yang, Rami Melhem, Bruce Childers, Youtao Zhang, Minyi Guo
DOI: https://doi.org/10.1145/3079856.3080203
2017-01-01
ACM SIGARCH Computer Architecture News
Abstract: GPUs have been widely adopted in data centers to provide acceleration services to many applications. Sharing a GPU is increasingly important for better processing throughput and energy efficiency. However, quality of service (QoS) among concurrent applications is minimally supported. Previous efforts are too coarse-grained and do not scale with increasing QoS requirements. We propose QoS mechanisms for a fine-grained form of GPU sharing. Our QoS support can provide control over the progress of kernels on a per-cycle basis and over the amount of thread-level parallelism of each kernel. Due to accurate resource management, our QoS support has significantly better scalability compared with previous best efforts. Evaluations show that, when the GPU is shared by three kernels, two of which have QoS goals, the proposed techniques achieve QoS goals 43.8% more often than previous techniques and have 20.5% higher throughput.