Scalable GPU Virtualization with Dynamic Sharing of Graphics Memory Space
Mochi Xue,Kun Tian,Yaozu Dong,Jiacheng Ma,Jiajun Wang,Zhengwei Qi,Bingsheng He,Haibing Guan
DOI: https://doi.org/10.1109/tpds.2018.2789883
IF: 5.3
2018-01-01
IEEE Transactions on Parallel and Distributed Systems
Abstract:With increasing GPU-intensive workloads deployed on cloud, the cloud service providers are seeking for practical and efficient GPU virtualization solutions. However, the cutting-edge GPU virtualization techniques such as gVirt still suffer from the restriction of scalability, which constrains the number of guest virtual GPU instances. This paper introduces gScale, a scalable GPU virtualization solution. By taking advantage of the GPU programming model, gScale presents a dynamic sharing mechanism which combines partition and sharing together to break the hardware limitation of global graphics memory space. Particularly, we propose three approaches for gScale: (1) the private shadow graphics translation table, which enables global graphics memory space sharing among virtual GPU instances, (2) ladder mapping and fence memory space pool, which allows the CPU to access host physical memory space (serving the graphics memory) bypassing global graphics memory space, (3) slot sharing, which improves the performance of vGPU under a high density of instances. The evaluation shows that gScale scales up to 15 guest virtual GPU instances in Linux or 12 guest virtual GPU instances in Windows, which is 5x and 4x scalability, respectively, compared to gVirt. At the same time, gScale incurs a slight runtime overhead on the performance of gVirt when hosting multiple virtual GPU instances.