Efficient sparse-matrix multi-vector product on GPUs

Changwan Hong, Aravind Sukumaran-Rajam, Bortik Bandyopadhyay, Jinsung Kim, Süreyya Emre Kurt, Israt Nisa, Shivani Sabhlok, Ümit V. Çatalyürek, Srinivasan Parthasarathy, P. Sadayappan
DOI: https://doi.org/10.1145/3208040.3208062
2018-06-11
Abstract: Sparse Matrix-Vector (SpMV) and Sparse Matrix-Multivector (SpMM) products are key kernels for computational science and data science. While GPUs offer significantly higher peak performance and memory bandwidth than multicore CPUs, achieving high performance on sparse computations on GPUs is very challenging. A tremendous amount of recent research has focused on various GPU implementations of the SpMV kernel. But the multi-vector SpMM kernel has received much less attention. In this paper, we present an in-depth analysis to contrast SpMV and SpMM, and develop a new sparse-matrix representation and computation approach suited to achieving high data-movement efficiency and effective GPU parallelization of SpMM. Experimental evaluation using the entire SuiteSparse matrix suite demonstrates significant performance improvement over existing SpMM implementations from vendor libraries.
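To make the SpMV/SpMM contrast in the abstract concrete, the following is a minimal reference sketch of both products over a matrix stored in the standard CSR (compressed sparse row) format. It is not the representation or GPU algorithm developed in the paper, which the abstract does not describe; the function names spmm_csr and spmv_csr and the pure-Python list layout are assumptions chosen for illustration only.

```python
def spmm_csr(row_ptr, col_idx, vals, X):
    """Compute Y = A @ X, with A in CSR form and X dense (n_cols rows, k columns).

    Hypothetical reference routine for illustration; not the paper's method.
    """
    n_rows = len(row_ptr) - 1
    k = len(X[0]) if X else 0
    Y = [[0.0] * k for _ in range(n_rows)]
    for i in range(n_rows):
        y_row = Y[i]
        # Nonzeros of row i live in positions row_ptr[i] .. row_ptr[i+1]-1.
        for p in range(row_ptr[i], row_ptr[i + 1]):
            a = vals[p]
            x_row = X[col_idx[p]]
            # Each nonzero of A is reused across all k columns of X.
            for j in range(k):
                y_row[j] += a * x_row[j]
    return Y


def spmv_csr(row_ptr, col_idx, vals, x):
    """SpMV expressed as the k = 1 special case of SpMM."""
    Y = spmm_csr(row_ptr, col_idx, vals, [[v] for v in x])
    return [row[0] for row in Y]


# A = [[1, 0, 2],
#      [0, 3, 0]] in CSR form
row_ptr = [0, 2, 3]
col_idx = [0, 2, 1]
vals = [1.0, 2.0, 3.0]

X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = spmm_csr(row_ptr, col_idx, vals, X)
y = spmv_csr(row_ptr, col_idx, vals, [1.0, 3.0, 5.0])
```

In this sketch each nonzero of A is loaded once and used k times in SpMM, whereas SpMV uses it only once. That difference in data reuse is one reason the two kernels can call for different data layouts and parallelization strategies.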