Preemption-Aware Kernel Scheduling for GPUs

Sihuizi Jin, Zhenning Wang, Quan Chen, Minyi Guo
DOI: https://doi.org/10.1109/ispa/iucc.2017.00087
2017-01-01
Abstract: GPUs have been widely used in modern datacenters to accelerate emerging services such as Graph Processing, Intelligent Personal Assistant (IPA), and Deep Learning. However, current GPUs have very limited support for sharing. They are shared in a time-multiplexed manner in datacenters, which leads to low throughput. Previous studies on GPU kernel scheduling either target fairness or share GPUs only statically, and so cannot handle dynamically arriving kernels. Recent work has proposed hardware preemption mechanisms for GPUs, enabling dynamic sharing. Exploiting such a mechanism, we propose a preemption-aware kernel scheduling strategy for GPUs. Our strategy improves throughput by running complementary kernels together. Furthermore, when new kernels arrive, our strategy decides whether to preempt running kernels by weighing the performance benefit of preemption against its overhead using analytic models. Evaluation results show that our strategy improves throughput by 20.1% over sequential execution and by 11.5% over a first-come, first-served (FCFS) strategy.
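The benefit-versus-overhead decision described in the abstract can be illustrated with a minimal sketch. It is not the paper's analytic model: the class `PreemptionEstimate`, its fields, and the function `should_preempt` are hypothetical names introduced here, and the estimates are assumed to come from some external performance model.

```python
from dataclasses import dataclass


@dataclass
class PreemptionEstimate:
    # All fields are hypothetical model outputs, in milliseconds.
    time_without_preemption: float  # predicted completion time if the new kernel waits
    time_with_preemption: float     # predicted completion time if running kernels are preempted
    preemption_overhead: float      # predicted cost of saving and restoring kernel state


def should_preempt(est: PreemptionEstimate) -> bool:
    """Return True only when the predicted gain exceeds the predicted overhead."""
    benefit = est.time_without_preemption - est.time_with_preemption
    return benefit > est.preemption_overhead
```

Under these assumptions, a scheduler would call `should_preempt` each time a new kernel arrives and leave the running kernels untouched whenever the predicted benefit does not cover the cost of preemption.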