Implementation of a Lattice Boltzmann Method for Large Eddy Simulation on Multiple GPUs

Qinjian Li, Chengwen Zhong, Kai Li, Guangyong Zhang, Xiaowei Lu, Qing Zhang, Kaiyong Zhao, Xiaowen Chu
DOI: https://doi.org/10.1109/HPCC.2012.115
2012-01-01
Abstract: Recently, the graphics processing unit (GPU) has evolved into a highly parallel, multithreaded, many-core processor with tremendous computational horsepower and very high memory bandwidth. To improve the simulation efficiency of complex flow phenomena in computational fluid dynamics, a CUDA-based algorithm for large eddy simulation on multiple GPUs is proposed. Our implementation adopts the "collision after propagation" scheme and performs the propagation step through global memory read transactions. For simplicity, the working set is split into equal sub-domains, one assigned to each GPU. With recently released hardware, up to four GPUs can be controlled by a single CPU thread and run in parallel. The results show that our multi-GPU implementation can perform simulations on a rather large scale (mesh size: 10240×10240), even with double-precision floating-point arithmetic, and achieves a 190X speedup over the sequential CPU implementation.
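The two design choices named in the abstract, equal sub-domains per GPU and propagation done by reads, can be illustrated with a minimal CPU-side sketch in plain Python. This is not the authors' CUDA code. It assumes a D2Q9 lattice, a split along rows, and periodic boundaries, none of which the abstract states, and the names `split_rows` and `pull_propagate` are hypothetical. The collision and large eddy simulation steps are omitted.

```python
# Minimal sketch, not the paper's implementation.
# Assumptions: D2Q9 lattice, row-wise split, periodic boundaries.

# D2Q9 lattice velocities (cx, cy) in a common ordering.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]


def split_rows(ny, num_devices):
    """Split ny rows into equal contiguous slabs, one per device."""
    if ny % num_devices != 0:
        raise ValueError("ny must be divisible by num_devices")
    rows = ny // num_devices
    return [(d * rows, (d + 1) * rows) for d in range(num_devices)]


def pull_propagate(f, nx, ny):
    """Propagation by reads: node (x, y) gathers f_i from (x - cx, y - cy).

    f is indexed as f[i][y][x]. Each output value is written exactly once,
    to the node's own location, so only the reads are scattered.
    """
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(len(C))]
    for i, (cx, cy) in enumerate(C):
        for y in range(ny):
            for x in range(nx):
                out[i][y][x] = f[i][(y - cy) % ny][(x - cx) % nx]
    return out
```

For example, `split_rows(10240, 4)` gives four slabs of 2560 rows each. In a GPU version, a collision step would consume the gathered values at each node immediately after the reads, which is what "collision after propagation" suggests; the rows shared between neighbouring slabs would also need to be exchanged between devices, a step this sketch does not model.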