Improving the Space-Time Efficiency of Matrix Multiplication Algorithms

Yuan Tang
DOI: https://doi.org/10.1145/3409390.3409404
2020-01-01
Abstract: Classic cache-oblivious parallel matrix multiplication algorithms achieve optimality in either time or space, but not both, which has motivated considerable research on the best possible trade-off between the two. We study modern processor-oblivious runtime systems and identify several ways to improve an algorithm's time complexity while keeping its space and cache requirements asymptotically optimal. Based on this study, we present sublinear-time algorithms with optimal work, space, and cache complexity for both general matrix multiplication on a semiring and Strassen-like fast algorithms on a ring. Our experiments show that these algorithms have empirical advantages over their classic counterparts. Our study provides new insights and research angles on how to optimize cache-oblivious parallel algorithms from both theoretical and empirical perspectives.