Performance Modeling and Tuning for DFT Calculations on Heterogeneous Architectures

Hadia Ahmed, David B. Williams-Young, Khaled Z. Ibrahim, Chao Yang
DOI: https://doi.org/10.1109/ipdpsw52791.2021.00108
2021-01-01
Abstract: Tuning scientific code for heterogeneous computing architectures is a growing challenge. Not only do we need to tune the code for multiple architectures, but we also need to select or schedule computations onto the most efficient compute variant. In this paper, we explore the tuning and performance modeling of one of the most time-consuming kernels in density functional theory calculations on systems with a multicore host CPU accelerated with GPUs. We show that the problem configuration dictates the choice of the most efficient compute engine. This choice can alternate between the host and the accelerator, especially while scaling. As such, a performance model that predicts the execution time on the host CPU and GPU is essential to select the compute environment and to achieve optimal performance. We present a simple model that empirically carries out such tasks and can accurately steer the scheduling of computation.
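To make the idea of model-driven device selection concrete, the following is a minimal sketch, not the authors' model. It assumes, purely for illustration, that execution time on each device is roughly linear in a single "work" measure (fixed overhead plus per-unit cost), fits that line to a few timings by least squares, and dispatches to whichever device has the lower predicted time. The names `LinearTimeModel`, `fit_linear`, and `choose_device` are hypothetical, and the timing samples are made-up placeholders rather than measurements from the paper.

```python
from dataclasses import dataclass


@dataclass
class LinearTimeModel:
    """Assumed model form: time = overhead + per_unit * work."""
    overhead: float  # fixed cost in seconds (e.g. launch or transfer latency)
    per_unit: float  # seconds per unit of work

    def predict(self, work: float) -> float:
        return self.overhead + self.per_unit * work


def fit_linear(samples):
    """Ordinary least-squares fit of (work, seconds) pairs to a line."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return LinearTimeModel(overhead=mean_y - slope * mean_x, per_unit=slope)


def choose_device(work, cpu_model, gpu_model):
    """Pick the device with the lower predicted execution time."""
    if gpu_model.predict(work) < cpu_model.predict(work):
        return "gpu"
    return "cpu"


# Placeholder timings (work units, seconds); NOT data from the paper.
cpu_samples = [(1000, 0.011), (2000, 0.021), (4000, 0.041)]
gpu_samples = [(1000, 0.021), (2000, 0.022), (4000, 0.024)]

cpu_model = fit_linear(cpu_samples)
gpu_model = fit_linear(gpu_samples)

for work in (500, 10000):
    print(work, choose_device(work, cpu_model, gpu_model))
```

With these placeholder numbers the GPU has a higher fixed overhead but a lower per-unit cost, so small problems are routed to the host and large ones to the accelerator. This mirrors the abstract's observation that the best compute engine depends on the problem configuration.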