M2L Translation Operators for Kernel Independent Fast Multipole Methods on Modern Architectures

Srinath Kailasa, Timo Betcke, Sarah El Kazdadi
2024-08-14
Abstract: Current and future trends in computer hardware, in which the disparity between available flops and memory bandwidth continues to grow, favour algorithm implementations that minimise data movement, even at the cost of more flops. In this study we review the requirements for high-performance implementations of the kernel-independent Fast Multipole Method (kiFMM), a variant of the FMM algorithm for the rapid evaluation of N-body potential problems. Performant implementations of the kiFMM typically rely on Fast Fourier Transforms (FFTs) for the crucial M2L (Multipole-to-Local) operation. In recent years, however, BLAS-based M2L translation operators that rely on direct matrix compression techniques have become popular for other FMM variants, such as the black-box FMM. In this paper we present algorithmic improvements for BLAS-based M2L translation operators and benchmark them against FFT-based M2L translation operators. To allow a fair comparison, we have implemented our own high-performance kiFMM in Rust, which performs competitively against other implementations and allows us to switch flexibly between BLAS-based and FFT-based translation operators.
Computational Engineering, Finance, and Science