New YARN sharing GPU based on graphics memory granularity scheduling

Jinliang Shi, Dewu Chen, Jiabi Liang, Lin Li, Yue Lin, Jianjiang Li
DOI: https://doi.org/10.1016/j.parco.2023.103038
IF: 0.983
2023-07-22
Parallel Computing
Abstract: As one of the most widely used cluster scheduling frameworks, Hadoop YARN originally supported only CPU and memory scheduling. With the widespread use of AI, demand for GPUs is also increasing, so Hadoop YARN V3.0 added GPU scheduling, but only at the granularity of a whole card rather than at the finer granularity of graphics memory. In day-to-day training, a task may need far less graphics memory than a whole GPU card provides, yet it still occupies the whole card, which wastes resources. TensorFlow provides an API for controlling graphics memory usage that can address this issue. We therefore propose introducing this capability into Hadoop YARN so that it supports heterogeneous scheduling of CPU, memory, and graphics memory. Taking the Hadoop V2.7 source code as the underlying architecture, we design a new scheduler, GSHARE. Compared with previous scheduling strategies, on a cluster of 3 nodes with 3 GPU cards per node and 12 GB of graphics memory per card, GSHARE improves efficiency by up to 74% for TensorFlow tasks that use 2 GB of graphics memory. It also minimizes the graphics memory wasted because the TensorFlow API cannot control graphics memory proportionally across multiple cards.
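The difference between whole-card and graphics-memory granularity can be illustrated with a small sketch for the cluster size quoted in the abstract (3 nodes, 3 cards per node, 12 GB per card, 2 GB tasks). This is not GSHARE's actual algorithm, which the abstract does not describe; the `GpuCard` class, the helper names, and the first-fit placement policy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GpuCard:
    """Hypothetical model of one GPU card (not a YARN class)."""
    node: int
    index: int
    capacity_gb: int
    used_gb: int = 0

    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


def build_cluster(nodes=3, cards_per_node=3, card_gb=12):
    # Cluster shape taken from the abstract: 3 nodes x 3 cards x 12 GB.
    return [GpuCard(n, c, card_gb)
            for n in range(nodes)
            for c in range(cards_per_node)]


def place_whole_card(cards, task_gb):
    """Whole-card granularity: a task claims an entire idle card."""
    for card in cards:
        if card.used_gb == 0 and task_gb <= card.capacity_gb:
            card.used_gb = card.capacity_gb
            return card
    return None


def place_by_memory(cards, task_gb):
    """Memory granularity: first-fit on free graphics memory (assumed policy)."""
    for card in cards:
        if card.free_gb() >= task_gb:
            card.used_gb += task_gb
            return card
    return None


def count_concurrent(place, task_gb):
    # Keep placing identical tasks on a fresh cluster until none fits.
    cards = build_cluster()
    placed = 0
    while place(cards, task_gb) is not None:
        placed += 1
    return placed


whole_card_slots = count_concurrent(place_whole_card, 2)
memory_slots = count_concurrent(place_by_memory, 2)
print(whole_card_slots, memory_slots)
```

Under these assumptions, whole-card scheduling admits one 2 GB task per card, while memory-granularity scheduling admits as many tasks as fit in each card's 12 GB. These are upper bounds on concurrency only; the 74% efficiency figure in the abstract is a measured result and is not reproduced by this sketch.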
computer science, theory & methods