Optimizations on Sparse Matrix-Vector Multiplication Based on CUDA

Zhou Hong, Fan Xiaoya, Zhao Lili
DOI: https://doi.org/10.16526/j.cnki.11-4762/tp.2010.08.039
2010-01-01
Abstract: With advances in VLSI technology, integrating multiple cores on a single chip has become practical, and the modern GPU is a typical multi-core device. Driven by the rapid growth of computation-intensive applications, current GPUs are capable of general-purpose computation. This paper first introduces CUDA and sparse matrices. Based on the CSR format for sparse matrices, it then presents three program optimization methods under the CUDA model, each of which is analyzed and implemented. Experiments on a GeForce 9600GT show that a speedup of almost 4x was achieved over CPU computation.
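
For readers unfamiliar with the CSR (compressed sparse row) format named in the abstract, the following is a minimal sketch of the unoptimized sparse matrix-vector product y = A·x over CSR arrays. It is a plain CPU reference in Python, not the paper's CUDA code. The abstract does not describe the three optimizations, so none are shown here. The names `csr_spmv`, `row_ptr`, `col_idx`, and `values` are illustrative assumptions and do not come from the paper.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    All names are hypothetical, chosen for this sketch.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of this row.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix (3 x 3, five nonzeros):
# [[10, 0, 2],
#  [ 0, 3, 0],
#  [ 1, 0, 4]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [10.0, 2.0, 3.0, 1.0, 4.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)  # [16.0, 6.0, 13.0]
```

Each iteration of the outer loop is independent of the others, so a common starting point for a GPU version is to assign one thread to each row. Whether the paper starts from that mapping is not stated in the abstract.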