Dynamic Load Balancing in GPU-Based Systems - Early Experiments
Alvaro Luiz Fazenda, Celso L. Mendes, Laxmikant V. Kale, Jairo Panetta, Eduardo Rocha Rodrigues
DOI: https://doi.org/10.48550/arXiv.1310.4218
2013-10-15
Distributed, Parallel, and Cluster Computing
Abstract: The dynamic load-balancing framework in Charm++/AMPI, developed at the University of Illinois, uses processor virtualization to allow thread migration across processors. This framework has been successfully applied to many scientific applications, such as BRAMS, NAMD, and ChaNGa. Most of these applications use only CPUs to perform their operations. However, the use of GPUs to improve computational performance is spreading rapidly in the high-performance computing community. This paper investigates how the same Charm++/AMPI framework can be extended to balance load in a synthetic application inspired by the BRAMS numerical forecast model, running mostly on GPUs rather than on CPUs. Many major questions involving the use of GPUs with AMPI were handled in this work, including how to measure the GPU's load, how to use and share GPUs among user-level threads, and what results are obtained when applying the mandatory over-decomposition technique to a GPU-accelerated program.