AMF-CSR: Adaptive Multi-Row Folding of CSR for SpMV on GPU.

Jianhua Gao, Weixing Ji, Jie Liu, Senhao Shao, Yizhuo Wang, Feng Shi
DOI: https://doi.org/10.1109/icpads53394.2021.00058
2021-01-01
Abstract: SpMV (sparse matrix-vector multiplication) is a cost-dominant operation in many iterative methods for solving large-scale sparse linear systems. However, SpMV's irregular memory access to the multiplied vector leads to low data locality and thus degrades performance. This paper presents an adaptive multi-row folding of CSR (AMF-CSR) format for SpMV on GPUs. The new storage format supports folding a variable number of rows in order to achieve better load balancing in computation. AMF-CSR not only increases the density of nonzero elements in a folded row, thereby improving the access locality of the multiplied vector, but also merges an approximately equal number of nonzero elements into each folded row, hence achieving load balancing. A performance evaluation using 28 sparse matrices shows that the proposed AMF-CSR-based SpMV algorithm achieves maximum speedups of 4.11x and 3.62x on the GTX 1080 Ti and Tesla V100, respectively, over a fixed multi-row folding-based SpMV algorithm. Evaluation results using 450 regular and 450 irregular sparse matrices also show that AMF-CSR is superior to other SpMV implementations.
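To make the two ideas in the abstract concrete, the following is a minimal CPU-side sketch in Python, not the authors' GPU implementation. `csr_spmv` is the standard CSR SpMV loop and shows where the irregular access to the multiplied vector comes from. `group_rows_by_nnz` and its `target_nnz` parameter are hypothetical names for a simple greedy grouping that packs a variable number of consecutive rows into groups with roughly equal nonzero counts; the abstract does not describe how AMF-CSR actually selects or lays out folded rows, so this only illustrates the load-balancing goal.

```python
def csr_spmv(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is gathered through col_idx, so consecutive reads of x
            # can land far apart in memory: the irregular access.
            y[i] += vals[k] * x[col_idx[k]]
    return y


def group_rows_by_nnz(row_ptr, target_nnz):
    """Greedily group consecutive rows until each group holds at least
    target_nnz nonzeros (hypothetical helper, not the paper's method).

    Returns a list of (first_row, end_row_exclusive) pairs.
    """
    groups = []
    start = 0
    n_rows = len(row_ptr) - 1
    for i in range(n_rows):
        if row_ptr[i + 1] - row_ptr[start] >= target_nnz:
            groups.append((start, i + 1))
            start = i + 1
    if start < n_rows:
        # Remaining rows form a final, possibly lighter, group.
        groups.append((start, n_rows))
    return groups


# Example 4x4 matrix:
# [[1, 0, 2, 0],
#  [0, 3, 0, 0],
#  [4, 0, 5, 6],
#  [0, 0, 0, 7]]
row_ptr = [0, 2, 3, 6, 7]
col_idx = [0, 2, 1, 0, 2, 3, 3]
vals = [1, 2, 3, 4, 5, 6, 7]
x = [1, 2, 3, 4]

y = csr_spmv(row_ptr, col_idx, vals, x)
groups = group_rows_by_nnz(row_ptr, target_nnz=3)
```

In this example the short rows 0 and 1 are grouped together (3 nonzeros in total), while the dense row 2 forms a group on its own, so the number of rows per group varies with the row lengths rather than being fixed.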