Evaluating performance portability of five shared-memory programming models using a high-order unstructured CFD solver
Zhe Dai, Liang Deng, YongGang Che, Ming Li, Jian Zhang, Yueqing Wang
DOI: https://doi.org/10.1016/j.jpdc.2023.104831
IF: 4.542
2024-01-01
Journal of Parallel and Distributed Computing
Abstract: This paper presents the implementation and optimization of a high-order unstructured computational fluid dynamics (CFD) solver using five shared-memory programming models: CUDA, OpenACC, OpenMP, Kokkos, and OP2. The study evaluates the performance of these models on different hardware architectures, including NVIDIA GPUs, x86-based Intel and AMD systems, and Arm-based systems. The goal is to determine whether these models let developers write performance-portable solvers that run efficiently on a variety of architectures. By visualizing performance portability (PP) and measuring productivity, the paper forms a more holistic view of a high-order solver across multiple platforms. It also gives guidelines for translating existing codebases and their data structures to these models.
computer science, theory & methods