An Asynchronous Parallel Implementation of Multilevel Fast Multipole Algorithm on GPU Cluster for 3D Electromagnetic Scattering Problems

Rong-Ping Xi, We-Jia He, Ming-Lin Yang, Xin-Qing Sheng
DOI: https://doi.org/10.23919/aces-china52398.2021.9581392
2021-01-01
Symposium
Abstract: This paper presents an improved parallel multilevel fast multipole algorithm (MLFMA), based on a CPU/GPU asynchronous computing pattern, for 3D electromagnetic scattering problems on a GPU cluster. In the presented implementation, the matrix assembly process of the MLFMA is decomposed into two parts, one executed on the CPU and one on the GPU. The former is performed on the CPU using the OpenMP multi-threading programming model, while the latter is performed on the GPU with the CUDA programming model. The execution of the two parts is overlapped by exploiting asynchronous execution between the CPU and the GPU. The performance of the proposed parallel implementation is investigated in terms of accuracy and efficiency. Numerical results show that the proposed approach attains a speed-up of over 10% compared with the original parallel implementation.
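The overlap pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which uses OpenMP and CUDA. Here a Python background thread stands in for the asynchronous GPU part, the main thread plays the CPU part, and the toy matrix entry formula, the row-wise split, and the names `assemble_block` and `assemble_overlapped` are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def assemble_block(rows, n):
    """Fill the given rows of a toy n x n matrix (hypothetical entries 1 / (1 + |i - j|))."""
    return {i: [1.0 / (1 + abs(i - j)) for j in range(n)] for i in rows}


def assemble_overlapped(n, split):
    """Assemble rows [0, split) in a background task and rows [split, n) in the caller."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Non-blocking submit: stands in for an asynchronous GPU launch (assumption).
        device_future = pool.submit(assemble_block, range(0, split), n)
        # The "CPU part" proceeds while the background task is in flight.
        host_rows = assemble_block(range(split, n), n)
        # Synchronisation point: wait for the background part before merging.
        device_rows = device_future.result()
    merged = {**device_rows, **host_rows}
    return [merged[i] for i in range(n)]


if __name__ == "__main__":
    matrix = assemble_overlapped(n=8, split=5)
    print(len(matrix), "rows assembled")
```

The sketch shows only the control flow: one part is launched without blocking, the other runs in the meantime, and a single wait joins them. In a real CPU/GPU setting the choice of split point would determine how well the two execution times balance.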