Optimizing Sparse Matrix-Vector Multiplication Based on GPU

MENGJIA YIN, TAO ZHANG, XU XIANBIN, HU JIN, HE SHUIBING
2012-01-01
Abstract: In recent years, Graphics Processing Units (GPUs) have attracted the attention of many application developers as powerful, massively parallel systems. Compute Unified Device Architecture (CUDA), a general-purpose parallel computing architecture, makes GPUs an appealing choice for solving many complex computational problems more efficiently. Sparse matrix-vector multiplication (SpMV) is one of the most important kernels in scientific computing. In this paper, we propose two new parallel algorithms, CSR-M, based on the CSR format, and ELLPACK-R, based on the ELLPACK format, and implement both as parallel kernels on the GPU with CUDA. We discuss how to implement and optimize SpMV on GPUs using the CUDA programming model. The optimization strategies include thread mapping, coalescing memory accesses, reusing data, avoiding branches, and optimizing thread blocks. The experimental results show that the proposed optimization strategies improve performance and memory bandwidth and reduce kernel execution time.
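The abstract names the storage formats but does not show them, so the following is a minimal CPU reference sketch in Python, not the authors' CUDA kernels. It assumes the usual three-array CSR layout and the ELLPACK-R layout as it is commonly described: rows padded to a common width, stored column-major, plus a per-row length array (called `rl` here). All function and variable names are hypothetical. CSR-M is not defined in the abstract and is therefore not shown. Each iteration of the outer loop over rows stands in for the work that one GPU thread would do under a one-thread-per-row mapping.

```python
# Reference (CPU) sketch of SpMV, y = A * x, in CSR and ELLPACK-R storage.
# Names are illustrative assumptions, not taken from the paper.

def dense_to_csr(dense):
    """Return (values, col_idx, row_ptr) for a dense list-of-lists matrix."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_idx.append(j)
        row_ptr.append(len(values))  # end offset of this row
    return values, col_idx, row_ptr


def spmv_csr(values, col_idx, row_ptr, x):
    """CSR SpMV; one outer iteration mirrors one thread handling one row."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def csr_to_ellpack_r(values, col_idx, row_ptr):
    """Pad rows to the longest row, store column-major, keep row lengths rl."""
    n = len(row_ptr) - 1
    rl = [row_ptr[i + 1] - row_ptr[i] for i in range(n)]
    width = max(rl, default=0)
    ell_val = [0.0] * (n * width)
    ell_col = [0] * (n * width)
    for i in range(n):
        for s in range(rl[i]):
            # Column-major: slot s of row i lives at s * n + i, so adjacent
            # rows (adjacent threads) read adjacent addresses.
            ell_val[s * n + i] = values[row_ptr[i] + s]
            ell_col[s * n + i] = col_idx[row_ptr[i] + s]
    return ell_val, ell_col, rl


def spmv_ellpack_r(ell_val, ell_col, rl, x):
    """ELLPACK-R SpMV; rl[i] bounds the loop, so padding is never tested."""
    n = len(rl)
    y = []
    for i in range(n):
        acc = 0.0
        for s in range(rl[i]):
            acc += ell_val[s * n + i] * x[ell_col[s * n + i]]
        y.append(acc)
    return y


A = [[4.0, 0.0, 0.0, 1.0],
     [0.0, 3.0, 0.0, 0.0],
     [2.0, 0.0, 5.0, 6.0]]
x = [1.0, 2.0, 3.0, 4.0]

csr = dense_to_csr(A)
ell = csr_to_ellpack_r(*csr)
y_csr = spmv_csr(*csr, x)
y_ell = spmv_ellpack_r(*ell, x)
print(y_csr, y_ell)
```

In this sketch the column-major layout illustrates the coalesced-access idea, and the `rl` loop bound illustrates branch avoidance: the inner loop needs no per-element check for padding. How the paper's kernels realize these strategies in detail is not stated in the abstract.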