Providing performance portable numerics for Intel GPUs

Yu‐Hsiang M. Tsai, Terry Cojean, Hartwig Anzt
DOI: https://doi.org/10.1002/cpe.7400
2022-10-27
Concurrency and Computation: Practice and Experience
Abstract: With discrete Intel GPUs entering the high‐performance computing landscape, there is an urgent need for production‐ready software stacks for these platforms. In this article, we report how we enable the Ginkgo math library to execute on Intel GPUs by developing a kernel backend based on the DPC++ programming environment. We discuss conceptual differences between the CUDA and DPC++ programming models and describe workflows for simplified code conversion. We evaluate the performance of basic and advanced sparse linear algebra routines available in Ginkgo's DPC++ backend relative to the hardware‐specific performance bounds and compare them against routines providing the same functionality that ship with Intel's oneMKL vendor library.