Alya towards Exascale: Optimal OpenACC Performance of the Navier-Stokes Finite Element Assembly on GPUs

Herbert Owen, Dominik Ernst, Thomas Gruber, Oriol Lemkuhl, Guillaume Houzeaux, Lucas Gasparino, Gerhard Wellein
2024-01-23
Abstract: This paper addresses the challenge of providing portable and highly efficient code structures for CPU and GPU architectures. We choose the assembly of the right-hand term in the incompressible flow module of the High-Performance Computational Mechanics code Alya, which is one of the two CFD codes in the Unified European Benchmark Suite. Starting from an efficient CPU code and a related OpenACC port for GPUs, we successively investigate the performance potential arising from code specialization, algorithmic restructuring, and low-level optimizations.
Distributed, Parallel, and Cluster Computing; Performance