Efficient Executions of Community Earth System Model Onto Accelerators Using GPUs.

Shijin Yuan, Cheng Wang, Bin Mu, Xiaodan Luo
DOI: https://doi.org/10.1145/3449301.3449334
2020-01-01
Abstract: As climate models grow more complex, running them efficiently becomes an enormous challenge. In this paper, we discuss the acceleration of the Community Earth System Model (CESM), a large-scale model that is parallelized with MPI but still has low execution efficiency. We study how to port the Community Land Model (CLM), an active component of CESM, onto Graphics Processing Units (GPUs), focusing on CanopyFluxes, the major routine that takes the most execution time. To expedite computation, we accelerate the model by using GPUs for parallel computing. Specifically, we use CUDA kernels to optimize some of the matrix computations in CanopyFluxes. For further optimization, we use GPU caches and compiler options. On a five-node cluster with five GPUs, the CanopyFluxes routine achieves a speedup of 4.21x. In a simulation on Tianhe-2 with NVIDIA Tesla K80 GPUs, the speedup of the CanopyFluxes routine rises to 14.92x.