GiantVM: A Novel Distributed Hypervisor for Resource Aggregation with DSM-aware Optimizations

Xingguo Jia,Jin Zhang,Boshi Yu,Xingyue Qian,Zhengwei Qi,Haibing Guan
DOI: https://doi.org/10.1145/3505251
IF: 1.444
2022-01-01
ACM Transactions on Architecture and Code Optimization
Abstract:We present GiantVM,1 an open-source distributed hypervisor that provides the many-to-one virtualization to aggregate resources from multiple physical machines. We propose techniques to enable distributed CPU and I/O virtualization and distributed shared memory (DSM) to achieve memory aggregation. GiantVM is implemented based on the state-of-the-art type-II hypervisor QEMU-KVM, and it can currently host conventional OSes such as Linux. (1) We identify the performance bottleneck of GiantVM to be DSM, through a top-down performance analysis. Although GiantVM offers great opportunities for CPU-intensive applications to enjoy the aggregated CPU resources, memory-intensive applications could suffer from cross-node page sharing, which requires frequent DSM involvement and leads to performance collapse. We design the guest-level thread scheduler, DaS (DSM-aware Scheduler), to overcome the bottleneck. When benchmarking with NAS Parallel Benchmarks, the DaS could achieve a performance boost of up to 3.5×, compared to the default Linux kernel scheduler. (2) While evaluating DaS, we observe the advantage of GiantVM as a resource reallocation facility. Thanks to the SSI abstraction of GiantVM, migration could be done by guest-level scheduling. DSM allows standby pages in the migration destination, which need not be transferred through the network. The saved network bandwidth is 68% on average, compared to VM live migration. Resource reallocation with GiantVM increases the overall CPU utilization by 14.3% in a co-location experiment.
What problem does this paper attempt to address?