An Optimized GP-GPU Warp Scheduling Algorithm for Sparse Matrix-Vector Multiplication

Lifeng Liu, Meilin Liu, Chong-Jun Wang
DOI: https://doi.org/10.1109/nas.2013.35
2013-01-01
Abstract: GP-GPUs are used as the platform for many applications because of their computational power and massively parallel architecture. In this paper, we first investigate the CSR sparse matrix format and the performance of existing optimized SpMV (sparse matrix-vector multiplication) algorithms, and analyze the memory access patterns of these algorithms. Based on this analysis, we propose a new thread scheduling technique that exploits inter-warp and intra-warp locality simultaneously and achieves memory coalescing automatically. The proposed scheduling technique significantly changes the memory access pattern of SpMV. Simulation results show that SpMV using the proposed thread scheduling technique performs much better than SpMV implementations optimized by other techniques.
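For readers unfamiliar with the CSR format mentioned in the abstract, the following is a minimal sequential sketch of CSR-based SpMV in Python. It is a textbook reference version, not the paper's warp scheduling technique, and the names `csr_spmv`, `row_ptr`, `col_idx`, and `values` are illustrative assumptions rather than identifiers from the paper. The indirect read `x[col_idx[k]]` is the irregular memory access that makes SpMV hard to coalesce on a GPU.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Walk the nonzeros of this row. The read of x is indirect,
        # so its addresses depend on the sparsity pattern.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix (hypothetical data, chosen for illustration):
# A = [[10, 0, 0, 2],
#      [ 0, 3, 0, 0],
#      [ 0, 0, 0, 0],
#      [ 1, 0, 4, 5]]
row_ptr = [0, 2, 3, 3, 6]
col_idx = [0, 3, 1, 0, 2, 3]
values = [10.0, 2.0, 3.0, 1.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)  # [18.0, 6.0, 0.0, 33.0]
```

On a GPU, a common baseline assigns one thread or one warp to each row of this outer loop; how rows and nonzeros are mapped to threads and warps determines the locality and coalescing behavior that the abstract refers to.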