Parallelization with load balancing of the weather scheme WSM7 for heterogeneous CPU-GPU platforms

Thomas Jakobs, Oliver Klöckner, Gudula Rünger
DOI: https://doi.org/10.1007/s11227-024-06009-9
IF: 3.3
2024-03-23
The Journal of Supercomputing
Abstract: This article presents an enhanced parallelization of the WSM7 microphysics scheme for the Weather Research and Forecasting Model (WRF). The parallelization is designed to maximize the utilization of a heterogeneous computing system consisting of CPUs, GPUs, or both. To this end, the reference implementation of the WSM7 scheme is re-implemented for the heterogeneous execution model. For each time step, a dynamic load distribution balances the computational load between the CPU and the GPU, aiming for a minimum overall execution time. The parallelized implementation is evaluated for a specific weather situation: the precipitation of the low-pressure zone "Bernd" from July 2021 is simulated using an Intel Core i7-7700 CPU and an NVIDIA GTX 1070 GPU. The results show a speedup of up to 28.51 for the GPU version compared with the reference implementation. The heterogeneous dynamic load balancing increases the speedup further by introducing a distribution factor that is updated for each time step.
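The abstract does not state how the distribution factor is updated, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes the factor is the share of grid columns assigned to the GPU and that it is recomputed after each time step from the measured CPU and GPU times. The names update_distribution_factor and simulate, and the device speeds, are hypothetical and chosen only for illustration.

```python
def update_distribution_factor(alpha, t_cpu, t_gpu):
    """Return the GPU share for the next time step.

    alpha is the GPU share used in the last step; t_cpu and t_gpu are
    the measured times of both devices for that step (assumed rule).
    """
    if not 0.0 < alpha < 1.0 or t_cpu <= 0.0 or t_gpu <= 0.0:
        raise ValueError("need 0 < alpha < 1 and positive times")
    # Observed throughput of each device, in grid share per unit time.
    rate_gpu = alpha / t_gpu
    rate_cpu = (1.0 - alpha) / t_cpu
    # Splitting in proportion to throughput makes both devices
    # finish at about the same time in the next step.
    return rate_gpu / (rate_gpu + rate_cpu)


def simulate(steps=5, columns=10_000, cpu_speed=100.0, gpu_speed=1000.0):
    """Run a toy model with made-up speeds (columns per unit time)."""
    alpha = 0.5  # initial guess: half of the columns go to the GPU
    history = []
    for _ in range(steps):
        n_gpu = round(alpha * columns)
        n_cpu = columns - n_gpu
        # Stand-ins for the measured execution times of one time step.
        t_gpu = n_gpu / gpu_speed
        t_cpu = n_cpu / cpu_speed
        # The step is finished when the slower device is done.
        history.append((alpha, max(t_cpu, t_gpu)))
        alpha = update_distribution_factor(n_gpu / columns, t_cpu, t_gpu)
    return history


if __name__ == "__main__":
    for share, step_time in simulate():
        print(f"GPU share {share:.3f}, step time {step_time:.2f}")
```

In this toy model the GPU is ten times faster than the CPU, so the share moves from the initial 0.5 toward 10/11 and the step time drops accordingly. Real measurements vary between time steps, which is why the factor is recomputed every step rather than fixed once.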
computer science, theory & methods,engineering, electrical & electronic, hardware & architecture