Run-Time Performance Estimation and Fairness-Oriented Scheduling Policy for Concurrent GPGPU Applications

Qingda Hu,Jiwu Shu,Jie Fan,Youyou Lu
DOI: https://doi.org/10.1109/icpp.2016.14
2016-01-01
Abstract:In order to satisfy the competition of multiple GPU-accelerated applications and make full use of GPU resources, a lot of previous works propose spatial-multitasking to execute multiple GPGPU applications simultaneously on a single GPU device. However, when adopting the spatial-multitasking framework, the inter-application interference may slow down different applications differently, leading to the unreasonable allocation of shared resources among concurrent GPGPU applications, degrading system fairness severely and resulting in sub-optimal performance. Thus, it is imperative to develop mechanisms to control negative inter-application interactions and utilize shared resources fairly and efficiently.Quantitatively estimating application slowdowns can enable us to accurately minimize system unfairness. Although several previous works pay attention on showdown estimation for CPUs, we find that they may be inaccurate for GPUs. Therefore, we propose a novel Dynamical Application Slowdown Estimation (DASE) model to estimate application slowdowns accurately. Our evaluations show that DASE has significantly lower estimation error (only 8.8%) than the state-of-the-art estimation models (36.3% and 32.8%) across all two-application workloads. Furthermore, to verify the effectiveness of our DASE model, we leverage our model to develop an efficient fairness-oriented Streaming Multiprocessors (SM) allocation policy DASE-Fair to minimize the overall system unfairness. Compared to the even SM partition policy, DASE-Fair improves fairness dramatically by more than 16.1% on average.
What problem does this paper attempt to address?