Exploring the Diversity of Multiple Job Deployments over GPUs for Efficient Resource Sharing

Yoonhee Kim, Jiwon Ha, Theodora Adufu
DOI: https://doi.org/10.1109/ICOIN59985.2024.10572198
2024-01-17
Abstract: Graphics Processing Units (GPUs) are gradually becoming a mainstream computing resource for efficient execution of applications, both on-premises and in the cloud. Currently, however, most HPC applications are unable to leverage the large computing capabilities GPUs provide, leading to resource under-utilization. Various GPU sharing approaches have been proposed that leverage either software-level or hardware-level mechanisms, such as MPS or MIG in NVIDIA GPUs. However, combining software-level and hardware-level technologies to mitigate resource under-utilization is yet to be fully explored. In this paper, we conduct a case study on scheduling memory-intensive and compute-intensive applications on an NVIDIA A30 GPU. We compare the performance when using only hardware-level sharing mechanisms with the performance when using both hardware-level and software-level mechanisms. We observed that combining both mechanisms improved total execution times by up to 14% for a single run while improving peak bandwidth utilization by about 39% for the SCAN application.
Computer Science, Engineering