GAugur
Yusen Li, Chuxu Shan, Ruobing Chen, Xueyan Tang, Wentong Cai, Shanjiang Tang, Xiaoguang Liu, Gang Wang, Xiaoli Gong, Ying Zhang
DOI: https://doi.org/10.1145/3307681.3325409
2019-01-01
Abstract: Cloud gaming has become very popular recently, but providing a satisfactory gaming experience to players at a modest cost is still challenging. Colocating several games on one server can improve server utilization. To enable efficient colocations while providing Quality of Service (QoS) guarantees, a precise quantification of the performance interference among colocated games is required. However, achieving such precise interference prediction is very challenging for games because of the complexity introduced by contention on many shared resources across the CPU and GPU. Moreover, the distinctive properties of cloud gaming require that the prediction model be constructed beforehand and that predictions be made instantaneously when requests arrive, which further increases the difficulty. The existing solutions are either not applicable or not effective due to many limitations. In this paper, we present GAugur, a novel methodology that enables highly accurate prediction of the performance interference among arbitrarily colocated games. By leveraging machine learning techniques, GAugur is able to capture the complex relationship between the interference and the contention features of colocated games. We evaluate GAugur through extensive experiments using a large number of real popular games. The results show that GAugur is able to identify whether a colocated game satisfies its QoS requirement within an average error of 5%, and is able to quantify the performance degradation of a colocated game within an average error of 7.9%, which significantly outperforms the alternatives. Moreover, GAugur incurs an offline profiling cost linear in the number of games and negligible overhead for online prediction. We apply GAugur to guide efficient game colocations for cloud gaming. Experimental results show that GAugur is able to increase resource utilization by 20% to 60% and improve overall performance by up to 15% compared to state-of-the-art solutions.