An Optimized Sparse Approximate Matrix Multiply for Matrices with Decay

Nicolas Bock, Matt Challacombe
DOI: https://doi.org/10.1137/120870761
IF: 2.968
2013-01-01
SIAM Journal on Scientific Computing
Abstract: We present an optimized single-precision implementation of the sparse approximate matrix multiply (SpAMM) [M. Challacombe and N. Bock, arXiv 1011.3534, 2010], a fast algorithm for matrix-matrix multiplication for matrices with decay that achieves an $\mathcal{O}(n \log n)$ computational complexity with respect to matrix dimension $n$. We find that the max norm of the error achieved with a SpAMM tolerance below $2 \times 10^{-8}$ is lower than that of the single-precision general matrix-matrix multiply (SGEMM) for dense quantum chemical matrices, while outperforming SGEMM with a cross-over already for small matrices ($n \sim 1000$). Relative to naive implementations of SpAMM using Intel's Math Kernel Library or AMD's Core Math Library, our optimized version is found to be significantly faster. Detailed performance comparisons are made for quantum chemical matrices with differently structured sub-blocks. Finally, we discuss the potential of improved hardware prefetch to yield 2x to 3x speedups.
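The abstract does not spell out the algorithm itself. As a rough illustration only, the general idea of a tolerance-controlled recursive multiply can be sketched as below. This is not the optimized single-precision kernel the paper describes: the function name `spamm`, the `leaf` block size, the use of the Frobenius norm, and the restriction to square matrices whose dimension is `leaf` times a power of two are all assumptions made for this sketch.

```python
import numpy as np


def spamm(a, b, tau, leaf=4):
    """Illustrative recursive approximate product of two square matrices.

    Assumptions (not taken from the abstract): Frobenius norms are used for
    the skip test, and the dimension is `leaf` times a power of two.
    """
    n = a.shape[0]
    # Skip the whole sub-product when the product of block norms is below
    # the tolerance; for matrices with decay this prunes far-off-diagonal work.
    if np.linalg.norm(a) * np.linalg.norm(b) < tau:
        return np.zeros((n, n), dtype=a.dtype)
    # At the leaf size, fall back to an ordinary dense product.
    if n <= leaf:
        return a @ b
    h = n // 2
    c = np.zeros((n, n), dtype=a.dtype)
    # Recurse over the 2 x 2 block decomposition: C_ij += A_ik * B_kj.
    for i in (0, h):
        for j in (0, h):
            for k in (0, h):
                c[i:i + h, j:j + h] += spamm(
                    a[i:i + h, k:k + h], b[k:k + h, j:j + h], tau, leaf
                )
    return c
```

With `tau = 0.0` no block is ever skipped and the sketch reduces to an ordinary blocked product. With a small positive `tau` and an input whose entries decay away from the diagonal, sub-products far from the diagonal are dropped. A practical version would store block norms in a tree rather than recompute them at every call, which this sketch does not attempt.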
mathematics, applied