Petascale turbulence simulation using a highly parallel fast multipole method on GPUs

Rio Yokota, L.A. Barba, Tetsu Narumi, Kenji Yasuoka
DOI: https://doi.org/10.1016/j.cpc.2012.09.011
IF: 4.717
2013-03-01
Computer Physics Communications
Abstract: This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier–Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 4096^3 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as the numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
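The arithmetic behind the figures quoted in the abstract can be checked with a minimal sketch. The function name `weak_scaling_efficiency` and its definition (baseline time divided by scaled time, with work per process held fixed) are assumptions for illustration, since the abstract does not state the formula the authors used. The timings passed to it in the last line are hypothetical and are not results from the paper.

```python
def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Assumed definition of weak-scaling efficiency.

    With fixed work per process, the ideal run time stays constant as
    processes are added, so efficiency is the ratio t_base / t_scaled.
    """
    return t_base / t_scaled


# Particle count for a cube of 4096^3 points (the "69 billion" in the abstract).
n_points = 4096 ** 3
print(f"{n_points:,} points")

# Ratio of the per-step times reported in the abstract (154 s vs 108 s).
time_ratio = 154.0 / 108.0
print(f"spectral / vortex time per step: {time_ratio:.2f}")

# Hypothetical timings, only to show how the helper is used.
print(weak_scaling_efficiency(t_base=80.0, t_scaled=100.0))
```

Here 4096^3 evaluates to 68,719,476,736, which rounds to the 69 billion particles quoted above.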
physics, mathematical; computer science, interdisciplinary applications