GPU acceleration of MPAS microphysics WSM6 using OpenACC directives: Performance and verification

Jae Youp Kim, Ji-Sun Kang, Minsu Joh
DOI: https://doi.org/10.1016/j.cageo.2020.104627
2021-01-01
Abstract: We have attempted to accelerate a microphysics scheme embedded within a next-generation climate/weather numerical model, the Model for Prediction Across Scales (MPAS), using OpenACC directives. We focused on parallelizing the Weather Research and Forecasting (WRF) single-moment 6-class microphysics scheme (WSM6), one of the most time-consuming physics parameterization schemes, on a Graphics Processing Unit (GPU). We applied several essential methodologies to optimize the performance of WSM6 computation on the GPU, so as to minimize data transfer between the Central Processing Unit (CPU) and the GPU and to reduce the waste of GPU threads during computation. As a result, GPU runs using one Tesla V100 were on average 4.29 times faster than 20-core CPU Message Passing Interface (MPI) runs, including I/O communication between the CPU and GPU. When the whole model was ported onto the GPU, we achieved a 10.44x speedup of WSM6 computation, which allowed us to measure the acceleration of WSM6 without I/O communication. In addition, we developed a precise verification method to distinguish nonlinear chaotic error growth from differences introduced by GPU computation, taking into account the characteristics of the major output variables from WSM6. For a fair comparison, we compared the difference between CPU and GPU runs to the difference between CPU runs with different compilers. We also examined bias in these differences, which can distort the climatology of the model simulation. Our approach successfully passed this verification process. This represents the first successful application of GPU acceleration to the realistic full-model integration of MPAS.
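The verification idea in the abstract (judging the CPU-versus-GPU difference against the difference between CPU runs built with different compilers, and checking those differences for bias) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the root-mean-square metric, the `factor` threshold, and the toy values are all assumptions made for illustration only.

```python
import math


def rms_difference(a, b):
    """Root-mean-square difference between two equally sized fields."""
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mean_bias(a, b):
    """Mean signed difference (a minus b); a persistent nonzero value
    would indicate a systematic drift rather than random scatter."""
    if len(a) != len(b):
        raise ValueError("fields must have the same length")
    return sum(x - y for x, y in zip(a, b)) / len(a)


def gpu_within_compiler_spread(cpu_ref, cpu_alt, gpu, factor=1.0):
    """Hypothetical acceptance check: the GPU-vs-CPU difference should not
    exceed `factor` times the CPU-vs-CPU difference between two compilers."""
    baseline = rms_difference(cpu_alt, cpu_ref)   # compiler A vs compiler B
    candidate = rms_difference(gpu, cpu_ref)      # GPU vs compiler A
    return candidate <= factor * baseline


# Toy values standing in for one WSM6 output variable (not real model data).
cpu_ref = [1.0, 2.0, 3.0, 4.0]        # CPU run, first compiler
cpu_alt = [1.1, 1.9, 3.1, 3.9]        # CPU run, second compiler
gpu = [1.05, 1.95, 3.05, 3.95]        # GPU run

accepted = gpu_within_compiler_spread(cpu_ref, cpu_alt, gpu)
bias = mean_bias(gpu, cpu_ref)
```

In this sketch the compiler-to-compiler difference serves as the yardstick for acceptable numerical noise, while the bias check looks for differences that share a sign and could therefore accumulate over a long integration.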
geosciences, multidisciplinary; computer science, interdisciplinary applications