FinePar: Irregularity-aware Fine-Grained Workload Partitioning on Integrated Architectures

Feng Zhang,Bo Wu,Jidong Zhai,Bingsheng He,Wenguang Chen
DOI: https://doi.org/10.1109/tkde.2019.2940184
2017-01-01
Abstract:The integrated architecture that features both CPU and GPU on the same die is an emerging and promising architecture for fine-grained CPU-GPU collaboration. However, the integration also brings forward several programming and system optimization challenges, especially for irregular applications. The complex interplay between heterogeneity and irregularity leads to very low processor utilization of running irregular applications on integrated architectures. Furthermore, fine-grained co-processing on the CPU and GPU is still an open problem. Particularly, in this paper, we show that the previous workload partitioning for CPU-GPU co-processing is far from ideal in terms of resource utilization and performance. To solve this problem, we propose a system software named FinePar, which considers architectural differences of the CPU and GPU and leverages finegrained collaboration enabled by integrated architectures. Through irregularity-aware performance modeling and online auto-tuning, FinePar partitions irregular workloads and achieves both device-level and thread-level load balance. We evaluate FinePar with 8 irregular applications on an AMD integrated architecture and compare it with state-of-the-art partitioning approaches. Results show that FinePar demonstrates better resource utilization and achieves an average of 1.38X speedup over the optimal coarse-grained partitioning method.
What problem does this paper attempt to address?