GPU Optimization of Lattice Boltzmann Method with Local Ensemble Transform Kalman Filter
Yuta Hasegawa, Toshiyuki Imamura, Takuya Ina, Naoyuki Onodera, Yuuichi Asahi, Yasuhiro Idomura
DOI: https://doi.org/10.1109/ScalAH56622.2022.00007
2023-08-07
Abstract: Ensemble data assimilation of computational fluid dynamics simulations, based on the lattice Boltzmann method (LBM) and the local ensemble transform Kalman filter (LETKF), is implemented and optimized on a GPU supercomputer based on NVIDIA A100 GPUs. To connect the LBM and LETKF parts, the data transpose communication is optimized by overlapping computation, file I/O, and communication according to the data dependency in each LETKF kernel. In two-dimensional forced isotropic turbulence simulations with an ensemble size of $M=64$ and $N_x=128^2$ grid points, the optimized implementation achieved a $3.80\times$ speedup over the naive implementation, in which the LETKF part is not parallelized. The main computing kernel of the local problem is the eigenvalue decomposition (EVD) of $M\times M$ real symmetric dense matrices, which is computed by a newly developed batched EVD in $\verb|EigenG|$. The batched EVD in $\verb|EigenG|$ outperforms that in $\verb|cuSOLVER|$, achieving a $65.3\times$ speedup.
Fluid Dynamics; Distributed, Parallel, and Cluster Computing; Computational Physics
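
As a rough illustration of the batched EVD kernel mentioned in the abstract, the following is a minimal CPU sketch in plain Python. It is not the algorithm used in EigenG or cuSOLVER, which the abstract does not describe; it substitutes the classical cyclic Jacobi method purely to show what "a batch of independent EVDs of small real symmetric matrices" means. The names `jacobi_evd` and `batched_evd`, the tolerance, and the sweep limit are all assumptions made for this example.

```python
import math


def jacobi_evd(a, tol=1e-12, max_sweeps=50):
    """Eigen-decompose one real symmetric matrix with cyclic Jacobi rotations.

    Returns (w, v): w[k] is an eigenvalue (unsorted) and column k of v,
    i.e. [v[i][k] for i in range(n)], is the matching eigenvector.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy, leave the input untouched
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        # Frobenius norm of the off-diagonal part; stop once it is negligible.
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue  # already zero, nothing to rotate
                # Rotation (c, s) chosen so that the new a[p][q] vanishes.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (
                    abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                # Apply A <- J^T A J: columns p, q first, then rows p, q.
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                # Accumulate eigenvectors: V <- V J.
                for k in range(n):
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def batched_evd(mats):
    """Decompose every matrix in the batch independently.

    Each problem is independent of the others, which is the property a GPU
    batched solver can exploit; here the batch is simply a sequential loop.
    """
    return [jacobi_evd(m) for m in mats]


if __name__ == "__main__":
    batch = [[[2.0, 1.0], [1.0, 2.0]],
             [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]]
    for w, _ in batched_evd(batch):
        print(sorted(w))
```

In the setting of the abstract, each matrix in the batch would be an $M\times M$ matrix with $M=64$. The sequential loop in `batched_evd` only stands in for the parallel execution a GPU library would perform.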