Performance evaluation and analysis of sparse matrix and graph kernels on heterogeneous processors

Feng Zhang, Weifeng Liu, Ningxuan Feng, Jidong Zhai, Xiaoyong Du
DOI: https://doi.org/10.1007/s42514-019-00008-6
2019-01-01
CCF Transactions on High Performance Computing
Abstract: Heterogeneous processors integrate very distinct compute resources, such as CPUs and GPUs, into the same chip, and can thus exploit the advantages and avoid the disadvantages of those compute units. In this work, we evaluate and analyze eight sparse matrix and graph kernels on an AMD CPU–GPU heterogeneous processor using 956 sparse matrices. Our major considerations are five characteristics: load balancing, indirect addressing, memory reallocation, atomic operations, and dynamic characteristics. The experimental results show that although the CPU and GPU parts access the same DRAM, they exhibit very different performance behaviors. For example, although the GPU part generally outperforms the CPU part, it does not achieve the best performance in all cases; in some cases the best performance is given by the CPU part. Moreover, the bandwidth utilization of atomic operations on heterogeneous processors can be much higher than on a high-end discrete GPU.