Efficient Sharing and Fine-Grained Scheduling of Virtualized GPU Resources

Xiaohui Zhao, Jianguo Yao, Ping Gao, Haibing Guan
DOI: https://doi.org/10.1109/ICDCS.2018.00077
2018-01-01
Abstract: Graphics Processing Units (GPUs) accelerate many applications, such as AI, games, and media transcoding. GPU virtualization is an enabling technology that lets multiple virtual machines (VMs) share the hardware. Sharing a GPU brings benefits such as high utilization, but it also introduces drawbacks such as resource contention and performance degradation. Although existing GPU scheduling policies have been optimized to some extent, deficiencies remain, such as inefficient GPU sharing among multiple VMs and high overhead during VM switching. As a result, the performance of GPU virtualization is limited by the current design, which lacks fine-grained scheduling support. In this paper, we propose Fine-grained schEduLing of vIrtualized gPu rEsources (FELIPE) to fully utilize and efficiently share a physical GPU among multiple VMs. FELIPE introduces fine-grained scheduling mechanisms for virtualized GPU resources in three aspects: 1) We design a mixed time/event-based scheduling policy to reduce the idle time during VM switching. 2) We create a seamless VM assignment process, which enables VMs to switch seamlessly in stages. 3) We develop a hybrid per-ring/VM scheduling strategy, which schedules workloads onto different GPU engines so that they run simultaneously. We implement FELIPE with Intel Graphics Virtualization Technology for shared vGPU technology (GVT-g). Experimental evaluations show that the first two scheduling policies achieve up to 21.5% and 19.7% performance improvement, respectively, and that the last one improves performance from 57.9% to 98.5% compared with the native design for two virtual machines.
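The abstract does not spell out how the mixed time/event-based policy works, so the following is only an illustrative toy model and not the FELIPE implementation. It assumes a round-robin scheduler in which a VM is switched out either when its time slice expires (time trigger) or as soon as its submitted burst completes (event trigger). The function name `simulate`, the burst lists, and the slice length are all hypothetical.

```python
from collections import deque


def simulate(bursts_per_vm, time_slice, event_driven):
    """Toy round-robin vGPU scheduler (illustrative assumption, not FELIPE).

    bursts_per_vm: one list per VM of GPU burst lengths, in ms.
    time_slice:    maximum time a VM may hold the GPU per turn, in ms.
    event_driven:  if True, switch as soon as the current burst finishes;
                   if False, always wait for the slice to expire.
    Returns (total_elapsed_ms, gpu_idle_ms).
    """
    queues = [deque(bursts) for bursts in bursts_per_vm]
    clock = 0
    idle = 0
    while any(queues):
        for queue in queues:
            if not queue:
                continue  # this VM has no work left; skip its turn
            burst = queue.popleft()
            run = min(burst, time_slice)
            clock += run
            if burst > run:
                # Time trigger: slice expired, requeue the unfinished part.
                queue.appendleft(burst - run)
            elif not event_driven:
                # Pure time-based policy: the GPU sits idle until the
                # slice ends, even though the burst already completed.
                idle += time_slice - run
                clock += time_slice - run
            # Mixed policy: burst completion is the event trigger, so the
            # next VM is scheduled immediately with no idle gap.
    return clock, idle


# Hypothetical workload: VM 0 issues short bursts, VM 1 issues long ones.
bursts = [[2, 2, 2], [5, 5]]
time_only = simulate(bursts, time_slice=4, event_driven=False)
mixed = simulate(bursts, time_slice=4, event_driven=True)
```

With this made-up workload, the purely time-based run takes 28 ms with 12 ms of GPU idle time, while the mixed run finishes the same 16 ms of work with no idle time. The numbers only illustrate why switching on completion events can reduce idle time; they say nothing about the gains measured in the paper.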