Abstract: We introduce the Fast Free Memory method (FFM), a new fast method for the numerical evaluation of convolution products. Inheriting from the Fast Multipole Method, the FFM is a descent-only and kernel-independent algorithm. We give the complete algorithm and the relevant complexity analysis. While dense matrices arise normally in such computations, the linear storage complexity and the quasi-linear computational complexity enable the evaluation of convolution products featuring up to one billion entries. We show how we are able to solve complex scattering problems using Boundary Integral Equations with dozens of millions of unknowns. Our implementation is made freely available within the Gypsilab framework under the GPL 3.0 license.
What problem does this paper attempt to address?
The paper addresses how to evaluate convolution products efficiently, in particular the matrix-vector products they give rise to when the number of nodes is very large (hundreds of millions, up to one billion). At this scale, traditional methods have excessive storage and computation costs, which limits the problem sizes that personal computers and small servers can handle. The authors therefore introduce a new fast algorithm, the Fast Free Memory method (FFM), which evaluates convolution products with linear storage complexity and quasi-linear computational complexity.
### Specific problem description
1. **Limitations of traditional methods**:
- Direct evaluation of a convolution product requires a dense matrix, so storage and computation costs grow quadratically with the number of nodes.
- Acceleration methods such as the Fast Multipole Method (FMM) are usually tied to a specific kernel function and become less effective for high-frequency oscillating kernels.
2. **Goals of FFM**:
- **Reduce storage requirements**: With linear storage complexity, large-scale matrix-vector products with up to one billion nodes become feasible even on laboratory-level servers.
- **Improve calculation efficiency**: With quasi-linear computational complexity, computation time is significantly reduced, which makes large-scale problems tractable.
- **Generality**: FFM is independent of the kernel function and can be applied to various types of convolution kernels, both non-oscillating and oscillating (such as the Helmholtz Green kernel).
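To make the quadratic cost concrete, here is a minimal Python sketch of the direct evaluation that FFM is meant to replace. It is not taken from the paper or from Gypsilab: the Laplace kernel `1 / (4*pi*r)` is chosen only as a simple non-oscillating example, and the names `laplace_kernel`, `direct_product`, and `dense_storage_bytes` are hypothetical. The product is accumulated block by block, so the full matrix is never stored, but the work still scales with the product of the two point counts.

```python
import numpy as np

def laplace_kernel(x, y):
    """Dense block G[i, j] = 1 / (4*pi*|x_i - y_j|); coincident points give 0."""
    diff = x[:, None, :] - y[None, :, :]
    r = np.sqrt(np.sum(diff**2, axis=-1))
    with np.errstate(divide="ignore"):
        g = 1.0 / (4.0 * np.pi * r)
    g[r == 0.0] = 0.0
    return g

def direct_product(x, y, v, kernel=laplace_kernel, block=256):
    """Evaluate u_i = sum_j G(x_i, y_j) v_j one row block at a time.

    Only a (block x len(y)) slice of the matrix exists at any moment,
    but every pair (x_i, y_j) is still visited: quadratic work.
    """
    u = np.zeros(x.shape[0], dtype=np.result_type(v, float))
    for start in range(0, x.shape[0], block):
        stop = min(start + block, x.shape[0])
        u[start:stop] = kernel(x[start:stop], y) @ v
    return u

def dense_storage_bytes(n, bytes_per_entry=8):
    """Bytes needed to store a full n x n matrix (8 bytes = one real double)."""
    return n * n * bytes_per_entry
```

For one billion nodes, `dense_storage_bytes(10**9)` gives 8 × 10^18 bytes for real double-precision entries, which is why storing the dense matrix is not an option at that scale.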
### Overview of the solution
FFM combines the advantages of several existing methods: the spatial partitioning of the Fast Multipole Method (FMM), the compression techniques of Hierarchical Matrices (H-matrices), and the interpolation approach of the Sparse Cardinal Sine Decomposition (SCSD). Specifically:
- **Spatial partitioning**: FFM partitions space with an octree in which all boxes at a given level have the same side length, which simplifies data interpolation.
- **Compression techniques**: Long-distance interactions are handled with low-rank approximations; short-distance interactions are computed as dense matrix blocks.
- **Interpolation method**: Use Lagrange interpolation and Adaptive Cross Approximation (ACA) to compress the transfer matrix.
- **Handling oscillating kernels**: For oscillating kernels (such as the Helmholtz Green kernel), use Gegenbauer series expansion and Non-Uniform Fast Fourier Transform (NUFFT) to improve calculation efficiency.
Through these improvements, FFM can significantly reduce storage and calculation costs while ensuring accuracy, and is suitable for large - scale scientific computing tasks, such as solving boundary integral equations.
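The low-rank treatment of long-distance interactions can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not the paper's implementation: it handles a single pair of well-separated cubic boxes, uses the Laplace kernel, places the Lagrange interpolation nodes at Chebyshev points (an assumption of this sketch), and leaves out the ACA compression of the transfer matrix. All function names are hypothetical.

```python
import numpy as np

def cheb_nodes(a, b, p):
    """p Chebyshev nodes on the interval [a, b]."""
    k = np.arange(p)
    t = np.cos((2 * k + 1) * np.pi / (2 * p))
    return 0.5 * (a + b) + 0.5 * (b - a) * t

def lagrange_matrix(nodes, pts):
    """L[i, a] = a-th Lagrange basis polynomial on `nodes`, evaluated at pts[i]."""
    p = nodes.size
    L = np.ones((pts.size, p))
    for a in range(p):
        for b in range(p):
            if a != b:
                L[:, a] *= (pts - nodes[b]) / (nodes[a] - nodes[b])
    return L

def box_interpolation(pts, lo, hi, p):
    """Tensor-product interpolation operator for points in the cube [lo, hi]^3.

    Returns S of shape (n_pts, p**3) and the interpolation grid (p**3, 3).
    """
    nodes = cheb_nodes(lo, hi, p)
    Lx, Ly, Lz = (lagrange_matrix(nodes, pts[:, d]) for d in range(3))
    S = np.einsum("ia,ib,ic->iabc", Lx, Ly, Lz).reshape(pts.shape[0], p**3)
    gx, gy, gz = np.meshgrid(nodes, nodes, nodes, indexing="ij")
    grid = np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)
    return S, grid

def laplace_kernel(x, y):
    """G[i, j] = 1 / (4*pi*|x_i - y_j|) for non-coincident point sets."""
    r = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    return 1.0 / (4.0 * np.pi * r)

def far_field_product(x, y, v, box_x, box_y, p=6):
    """Approximate G(x, y) @ v as Sx @ (T @ (Sy.T @ v)) for separated boxes."""
    Sx, grid_x = box_interpolation(x, box_x[0], box_x[1], p)
    Sy, grid_y = box_interpolation(y, box_y[0], box_y[1], p)
    T = laplace_kernel(grid_x, grid_y)  # small p**3 x p**3 transfer matrix
    return Sx @ (T @ (Sy.T @ v))
```

A possible usage, with sources in `[3, 4]^3` and targets in `[0, 1]^3`:

```python
rng = np.random.default_rng(1)
x = rng.random((400, 3))
y = 3.0 + rng.random((400, 3))
v = rng.random(400)
approx = far_field_product(x, y, v, (0.0, 1.0), (3.0, 4.0), p=6)
exact = laplace_kernel(x, y) @ v
```

The kernel is only evaluated on the two interpolation grids, so the cost of the interaction depends on `p` rather than on the number of point pairs. This is the property that a hierarchical method exploits at every level of the tree.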
### Numerical experiment verification
The paper verifies the effectiveness of FFM through numerical experiments that confirm the linear storage and quasi-linear computational complexity in practice. For example, for a convolution product with one billion nodes, FFM completes the calculation within four hours and requires only about 100 GB of memory.
In conclusion, the main contribution of this paper is the FFM algorithm, which removes the storage and computation bottlenecks of large-scale convolution products and provides a powerful tool for research in related fields.