FLUPS: A Fourier-Based Library of Unbounded Poisson Solvers

Denis-Gabriel Caprace, Thomas Gillis, Philippe Chatelain
DOI: https://doi.org/10.1137/19M1303848
2021-01-15
SIAM Journal on Scientific Computing
Abstract: SIAM Journal on Scientific Computing, Volume 43, Issue 1, Page C31-C60, January 2021. A Fourier-based library of unbounded Poisson solvers (FLUPS) for 2D and 3D homogeneous distributed grids is presented. It is designed to handle every possible combination of periodic, symmetric, semi-unbounded, and fully unbounded boundary conditions for the Poisson equation on rectangular domains with uniform resolution. FLUPS leverages a dedicated implementation of 3D Fourier transforms to solve the Poisson equation using Green's functions in a fast and memory-efficient way. Several Green's functions are available, optionally with explicit regularization, spectral truncation, or using lattice Green's functions, and provide verified convergence orders from 2 to spectral-like. The algorithm depends on the FFTW library to perform 1D transforms, while message passing interface (MPI) communications enable the required remapping of data in memory. For the latter operation, a first available implementation resorts to the standard all-to-all routines. A second implementation, featuring non-blocking and persistent point-to-point communications, is however shown to be more efficient in a majority of cases and especially while taking advantage of the shared memory parallelism with OpenMP. The scalability of the algorithm, aimed at massively parallel architectures, is demonstrated up to 73720 cores. The results obtained with three different supercomputers show that the weak efficiency remains above and the strong efficiency above when the number of cores is multiplied by 16, for typical problems. These figures are slightly better than those expected from a third party 3D fast Fourier transform (FFT) tool, with which a longer execution time was also measured on average.
From the outside, the solving procedure is fully automated, so the user benefits from optimal performance without having to handle the complexity associated with memory management, data mapping, and Fourier transform computation. The parallel code is available under Apache license 2.0 at github.com/vortexlab-uclouvain/flups.
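
To make the idea of solving the Poisson equation through a Fourier transform and a Green's function concrete, here is a minimal sketch for the simplest boundary condition listed in the abstract, the fully periodic one, reduced to 1D. It is not FLUPS code and does not use the FLUPS API: the function names `dft` and `solve_poisson_periodic_1d` are hypothetical, a naive O(n^2) transform stands in for FFTW, and the unbounded and symmetric cases that the library targets are not covered.

```python
import cmath
import math


def dft(values, sign):
    """Naive O(n^2) discrete Fourier transform (unnormalized).

    sign=-1 gives the forward transform, sign=+1 the backward one.
    """
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * m * k / n)
            for m, v in enumerate(values))
        for k in range(n)
    ]


def solve_poisson_periodic_1d(rhs, length):
    """Solve d^2(phi)/dx^2 = rhs on a periodic domain of the given length.

    The zero-mean solution is returned, because the periodic problem
    determines phi only up to an additive constant.
    """
    n = len(rhs)
    rhs_hat = dft(rhs, -1)
    phi_hat = [0j] * n  # mode 0 stays at zero (zero-mean solution)
    for k in range(1, n):
        # Map the DFT index to a signed mode number.
        mode = k if k <= n // 2 else k - n
        wavenumber = 2.0 * math.pi * mode / length
        # Spectral Green's function of the periodic Laplacian: -1 / k^2.
        phi_hat[k] = -rhs_hat[k] / wavenumber**2
    # Backward transform with the 1/n normalization.
    return [value.real / n for value in dft(phi_hat, +1)]


# Check against a known solution: phi = sin(2x), so rhs = -4 sin(2x).
n = 32
length = 2.0 * math.pi
x = [length * i / n for i in range(n)]
rhs = [-4.0 * math.sin(2.0 * xi) for xi in x]
phi = solve_poisson_periodic_1d(rhs, length)
max_error = max(abs(p - math.sin(2.0 * xi)) for p, xi in zip(phi, x))
```

In this periodic setting the Green's function is simply a division by the squared wavenumber in spectral space. The other boundary conditions named in the abstract require different transforms and Green's functions, which is the part the library automates.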