A Load Balance Scheduling Approach for Generative AI on Cloud-Native Environments with Heterogeneous Resources

Chen-Kai Chun, Kuan-Chou Lai
DOI: https://doi.org/10.1109/ICASI60819.2024.10547947
2024-04-17
Abstract: Generative AI has recently become an active research area, and its applications are increasingly popular. Generating images or videos from text or images is one prominent application, for example, Stable Diffusion. However, such approaches tend to overlook the scheduling of multiple resources in cloud-native environments with heterogeneous resources. Therefore, this work proposes an improved scheduling mechanism based on dynamic programming to improve load balance and resource utilization in such environments. In general, the proposed approach outperforms some native GPU scheduling strategies and an algorithm for the multidimensional knapsack problem. Experimental results show that load balance improves by up to 64.1% and the makespan is shortened by up to 40.8%.
Computer Science, Engineering
What problem does this paper attempt to address?