Parallel Sparse Matrix Multiplication for Preconditioning and SSTA on a Many-Core Architecture

Keliang Zhang, Baifeng Wu
DOI: https://doi.org/10.1109/nas.2012.11
2012-01-01
Abstract: Operations related to sparse matrix multiplication are frequently used in scientific computing, and they often become a performance bottleneck because of their high operational complexity. For example, multiplying a sparse matrix by a diagonal matrix (CS) is a key sub-procedure in preconditioning, and multiplying a sparse matrix by a one-dimensional block diagonal matrix (BCS) is a key sub-procedure for statistical static timing analysis (SSTA) without slope propagation based on a sparse matrix framework. Although the ELLH format and its variant are well suited to many-core architectures for the sparse matrix-vector multiplication (SpMV) operation, they have drawbacks for the other two operations. For the CS operation, ELLH leads to a large number of memory accesses because the column index matrix must be read. For the BCS operation, it not only leads to an even larger number of memory accesses, but also introduces high complexity in parallel programming because of the complex data dependencies among matrix elements. This paper presents a new sparse format, named the ELLV format. For the CS operation, the number of memory accesses can be reduced by half because the column index matrix no longer needs to be accessed. Experimental results show that with the ELLV format the performance of the CS operation improves by 15% compared with the ELLH format. For the BCS operation, because the column indices of the logical matrices and the physical matrices are consistent, the number of memory accesses is reduced even more markedly, and parallel programming on a many-core architecture becomes efficient and straightforward.
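
To make the memory-access argument for the CS operation concrete, here is a minimal sketch of multiplying a sparse matrix by a diagonal matrix in a generic ELLPACK-style layout. It is an illustration under stated assumptions, not the paper's implementation: the layout is plain ELLPACK rather than ELLH or ELLV (the abstract does not define either), the product is taken as A times diag(d) so that each column is scaled, and the helper names `to_ell` and `ell_times_diag` are hypothetical. The point it shows is that every stored value needs one read from the column index array to find its diagonal entry.

```python
def to_ell(dense):
    """Pack a dense matrix into ELLPACK-style arrays (assumed layout, not ELLH/ELLV).

    Returns (values, cols), both with one fixed-width row per matrix row.
    Padded slots hold value 0 and column index 0.
    """
    width = max(sum(1 for v in row if v != 0) for row in dense)
    values, cols = [], []
    for row in dense:
        nonzeros = [(v, j) for j, v in enumerate(row) if v != 0]
        pad = width - len(nonzeros)
        values.append([v for v, _ in nonzeros] + [0] * pad)
        cols.append([j for _, j in nonzeros] + [0] * pad)
    return values, cols


def ell_times_diag(values, cols, diag):
    """Compute A @ diag(d) for A stored in the ELLPACK-style layout above.

    Each stored value is scaled by diag[column], so the column index array
    is read once per stored value. The sparsity pattern is unchanged, so the
    result reuses the same column index array.
    """
    return [
        [v * diag[j] for v, j in zip(value_row, col_row)]
        for value_row, col_row in zip(values, cols)
    ]


if __name__ == "__main__":
    dense = [[1, 0, 2],
             [0, 3, 0],
             [4, 5, 6]]
    values, cols = to_ell(dense)
    print(ell_times_diag(values, cols, [10, 100, 1000]))
```

In this layout each output element costs two reads from matrix storage, one from `values` and one from `cols`, which is consistent with the abstract's claim that avoiding the column index access halves the number of memory accesses.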