A Throughput-Aware Analytical Performance Model for GPU Applications

Zhidan Hu, Guangming Liu, Wenrui Dong
DOI: https://doi.org/10.1007/978-3-662-44491-7_8
2014-01-01
Abstract: Graphics processing units (GPUs) have become increasingly popular for general-purpose parallel processing. Their massively parallel architecture allows GPUs to execute tens of thousands of threads in parallel to solve heavily data-parallel problems efficiently. However, despite this tremendous computing power, optimizing GPU kernels to achieve high performance remains a challenge because of the sea change from CPU to GPU and the lack of tools for programming and performance analysis.