Abstract: We introduce the Fast Free Memory method (FFM), a new fast method for the numerical evaluation of convolution products. Inheriting from the Fast Multipole Method, the FFM is a descent-only and kernel-independent algorithm. We give the complete algorithm and the relevant complexity analysis. While dense matrices normally arise in such computations, the linear storage complexity and the quasi-linear computational complexity enable the evaluation of convolution products featuring up to one billion entries. We show how we are able to solve complex scattering problems using Boundary Integral Equations with tens of millions of unknowns. Our implementation is made freely available within the Gypsilab framework under the GPL 3.0 license.

What problem does this paper attempt to address?

The problem this paper addresses is how to evaluate convolution products efficiently, especially matrix-vector products involving a very large number of nodes (hundreds of millions, up to a billion). Direct evaluation incurs storage and computation costs that grow too quickly with problem size, which limits what can be handled on personal computers and small servers. To address this, the authors introduce a new fast algorithm, the Fast Free Memory method (FFM), which evaluates convolution products with linear storage complexity and quasi-linear computational complexity.

### Specific problem description

1. **Limitations of traditional methods**:
   - Evaluating the convolution product directly produces a dense matrix, so storage and computation costs grow quadratically with the number of nodes.
   - Acceleration methods such as the Fast Multipole Method (FMM) usually depend on the specific kernel function and are less effective for highly oscillatory (high-frequency) kernels.
2. **Goals of FFM**:
   - **Reduce storage requirements**: With linear storage complexity, matrix-vector products with up to a billion nodes become feasible even on laboratory-scale servers.
   - **Improve computational efficiency**: With quasi-linear computational complexity, run time drops enough to make large-scale problems tractable.
   - **Generality**: FFM is kernel-independent and applies to many types of convolution kernels, both non-oscillatory and oscillatory (such as the Helmholtz Green kernel).
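To make the quadratic cost concrete, the sketch below estimates the storage of the dense matrix and evaluates the product row by row without ever storing it. It is an illustration only, not the FFM: the function names (`dense_storage_bytes`, `direct_product`, `laplace_kernel`) are hypothetical, and the Laplace kernel 1/(4π|x − y|) is assumed as an example of a non-oscillatory kernel.

```python
import math

def dense_storage_bytes(n, bytes_per_entry=8):
    """Bytes needed to store the full n-by-n interaction matrix."""
    return n * n * bytes_per_entry

def laplace_kernel(x, y):
    """Example kernel (assumption): G(x, y) = 1 / (4 pi |x - y|), zero on the diagonal."""
    r = math.dist(x, y)
    return 0.0 if r == 0.0 else 1.0 / (4.0 * math.pi * r)

def direct_product(targets, sources, weights, kernel):
    """Evaluate u_i = sum_j G(x_i, y_j) f_j one row at a time.

    Only O(N) memory is used because no matrix is stored,
    but the work is still O(N^2) kernel evaluations.
    """
    result = []
    for x in targets:
        acc = 0.0
        for y, f in zip(sources, weights):
            acc += kernel(x, y) * f
        result.append(acc)
    return result

if __name__ == "__main__":
    # Storage of the dense matrix for one million nodes, in terabytes.
    print(dense_storage_bytes(10**6) / 1e12, "TB")
    u = direct_product([(0.0, 0.0, 0.0)],
                       [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
                       [1.0, 1.0],
                       laplace_kernel)
    print(u)
```

With 8-byte entries, a billion nodes would require 8 × 10^18 bytes for the dense matrix, which is why storing it is not an option. The row-by-row loop removes the storage problem but not the quadratic work, which is the part a fast method has to address.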
### Overview of the solution

FFM combines ideas from several existing methods: the spatial partitioning of the Fast Multipole Method (FMM), the compression techniques of Hierarchical Matrices (H-matrices), and the interpolation approach of the Sparse Cardinal Sine Decomposition (SCSD). Specifically:

- **Spatial partitioning**: FFM partitions space with an octree in which all boxes at a given level have the same side length, which simplifies interpolation.
- **Compression techniques**: Far-field (long-distance) interactions are handled with low-rank approximations; near-field (short-distance) interactions are computed as dense blocks.
- **Interpolation method**: Lagrange interpolation is used, and the transfer matrix is compressed with Adaptive Cross Approximation (ACA).
- **Handling oscillating kernels**: For oscillatory kernels (such as the Helmholtz Green kernel), a Gegenbauer series expansion and the Non-Uniform Fast Fourier Transform (NUFFT) are used to keep the computation efficient.

With these ingredients, FFM reduces storage and computation costs substantially while preserving accuracy, which makes it suitable for large-scale scientific computing tasks such as solving boundary integral equations.

### Numerical experiment verification

The paper verifies FFM through numerical experiments that confirm its linear storage complexity and quasi-linear computational complexity. For example, for a convolution product with one billion nodes, FFM completes the computation within four hours using only about 100 GB of memory.

In conclusion, the main contribution of this paper is the FFM algorithm, which removes the storage and computation bottlenecks of large-scale convolution products and provides a practical tool for research in related fields.
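The far-field compression by Lagrange interpolation mentioned in the overview can be illustrated with a small sketch. It is deliberately reduced to one dimension with two well-separated intervals and the kernel 1/|x − y|; the choice of Chebyshev nodes, the function names, and the parameter `p` are assumptions made for illustration, and the octree, ACA, and NUFFT steps are not included.

```python
import math

def chebyshev_nodes(a, b, p):
    """p Chebyshev nodes of the first kind mapped to the interval [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos((2 * k + 1) * math.pi / (2 * p))
            for k in range(p)]

def lagrange_basis(nodes, x):
    """Values at x of all Lagrange basis polynomials defined on `nodes`."""
    values = []
    for a, xa in enumerate(nodes):
        v = 1.0
        for b, xb in enumerate(nodes):
            if b != a:
                v *= (x - xb) / (xa - xb)
        values.append(v)
    return values

def far_field_product(targets, sources, weights, kernel, t_box, s_box, p=8):
    """Approximate u_i = sum_j G(x_i, y_j) f_j for well-separated intervals.

    Uses G(x, y) ~ sum_a sum_b L_a(x) G(x_a, y_b) L_b(y), so the kernel is
    evaluated only p * p times instead of len(targets) * len(sources) times.
    """
    t_nodes = chebyshev_nodes(t_box[0], t_box[1], p)
    s_nodes = chebyshev_nodes(s_box[0], s_box[1], p)
    # Step 1: carry the source weights onto the source interpolation nodes.
    q = [0.0] * p
    for y, f in zip(sources, weights):
        for b, lb in enumerate(lagrange_basis(s_nodes, y)):
            q[b] += lb * f
    # Step 2: small p-by-p transfer between the two sets of nodes.
    m = [sum(kernel(xa, yb) * q[b] for b, yb in enumerate(s_nodes)) for xa in t_nodes]
    # Step 3: interpolate from the target nodes to the actual targets.
    return [sum(la * m[a] for a, la in enumerate(lagrange_basis(t_nodes, x)))
            for x in targets]

def inverse_distance(x, y):
    """Example kernel (assumption): G(x, y) = 1 / |x - y|."""
    return 1.0 / abs(x - y)

if __name__ == "__main__":
    targets = [i / 19 for i in range(20)]       # points in [0, 1]
    sources = [3.0 + j / 29 for j in range(30)]  # points in [3, 4]
    weights = [1.0] * len(sources)
    approx = far_field_product(targets, sources, weights, inverse_distance,
                               (0.0, 1.0), (3.0, 4.0))
    exact = [sum(inverse_distance(x, y) * f for y, f in zip(sources, weights))
             for x in targets]
    print(max(abs(a - e) / abs(e) for a, e in zip(approx, exact)))
```

The cost of the three steps scales with `p` times the number of points plus `p * p` for the transfer, rather than with the product of the two point counts. The approximation is only valid because the two intervals are well separated; nearby interactions would still need the dense evaluation described in the overview.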

The Fast and Free Memory Method for the efficient computation of convolution kernels

Fast multipole method accelerated by lifting wavelet transform scheme

Fast computation of radar cross-section by fast multipole method in conjunction with lifting wavelet-like transform

Fast Fourier transforms for the evaluation of convolution products: CPU versus GPU implementation

An Ultra-high-speed Reproducing Kernel Particle Method

A Fast Summation Method for translation invariant kernels

Fast Fourier Transform on Multipoles for Rapid Calculation of Magnetostatic Fields

Matrix-Free Finite Volume Kernels on a Dataflow Architecture

Parallel Multilevel Fast Multipole Method for Solving Large-Scale Problems

Fast Multipole Accelerated Scattering Matrix Method for Multiple Scattering of A Large Number of Cylinders

A low memory, highly concurrent multigrid algorithm

Memory footprint reduction for the FFT-based volume integral equation method via tensor decompositions

A Fast Multipole Method for axisymmetric domains

Fast Evaluation of Additive Kernels: Feature Arrangement, Fourier Methods, and Kernel Derivatives

Performant low-order matrix-free finite element kernels on GPU architectures

Fast hardware-aware matrix-free algorithm for higher-order finite-element discretized matrix multivector products on distributed systems

A Dual-space Multilevel Kernel-splitting Framework for Discrete and Continuous Convolution

Fast hardware-aware matrix-free algorithms for higher-order finite-element discretized matrix multivector products on distributed systems

A SVD accelerated kernel-independent fast multipole method and its application to BEM

Lightning-fast Method of Fundamental Solutions

The Fast Kernel Transform