Accelerating the Lagrangian Particle Tracking in Hydrologic Modeling at Continental-Scale
Chen Yang,Carl Ponder,Bei Wang,Hoang Tran,Jun Zhang,Jackson Swilley,Laura Condon,Reed Maxwell
DOI: https://doi.org/10.5194/egusphere-egu22-3285
IF: 8.469
2023-01-01
Journal of Advances in Modeling Earth Systems
Abstract:Unprecedented climate change and anthropogenic activities have induced increasing ecohydrological issues. Large-scale hydrologic modeling of water quantity is developing rapidly to seek solutions for those issues. Water-parcel transport (e.g., water age, water quality) is as important as water quantity to understand the changing water cycle. However, current scientific progress in water-parcel transport at large-scale is far behind that in water quantity. The known cause is the lack of powerful tools to handle observations and/or modeling of water-parcel transport at large-scale with high spatiotemporal resolutions. Lagrangian particle tracking based on integrated hydrologic modeling stands out among other methods because it accurately captures the water-parcel movements. Nonetheless, Lagrangian approach is computationally expensive, hindering its broad application in hydrologic modeling, particularly at large-scale. EcoSLIM, a grid-based particle tracking code, calculates water ages (e.g., evapotranspiration, outflow, and groundwater) and identifies source water composition (e.g., rainfall, snowmelt, and initial subsurface water), working seamlessly with the integrated hydrologic model ParFlow-CLM. EcoSLIM is written in Fortran and is originally parallelized by OpenMP (Open Multi-Processing) using shared CPU memory. As a result, we accelerate EcoSLIM by implementing it on distributed, multi-GPU platform using CUDA (Compute Unified Device Architecture) Fortran.We decompose the modeling domain into subdomains. Each GPU is responsible for one subdomain. Particles moving out of a subdomain continue moving temporarily in halo grid-cells around the subdomain and then are transferred to the neighbor subdomains. Different transfer schemes are built to balance the simulation accuracy and the computing speed. Particle transfer leverages the CUDA-aware MPI (Message Passing Interface) to improve the parallel efficiency. Load imbalance among GPUs induced by irregular domain boundaries and heterogeneity of flow paths is observed. A load-balancing scheme, borrowed from Particle-In-Cell and modified based on the characteristics of EcoSLIM, is established. The simulation starts on a number of GPUs fewer than the total scheduled GPUs. The manager MPI process activates an idle GPU for a subdomain once the particle number on its current GPU(s) is over a specified threshold. Finally, all scheduled GPUs are enabled. Tests of the new code from catchment-scale (the Little Washita watershed), to regional-scale (the North China Plain), and then to continental-scale (the Continental US) using millions to billions of particles show significant speedup and great parallel performance. The parallelized EcoSLIM is a promising tool for the hydrologic community to accelerate our understanding of the terrestrial water cycle beyond the water balance in the changing world.