Experience and Analysis of Scalable High-Fidelity Computational Fluid Dynamics on Modular Supercomputing Architectures

Martin Karp, Estela Suarez, Jan H. Meinke, Måns I. Andersson, Philipp Schlatter, Stefano Markidis, Niclas Jansson
2024-05-09
Abstract: The never-ending computational demand from simulations of turbulence makes computational fluid dynamics (CFD) a prime application use case for current and future exascale systems. High-order finite element methods, such as the spectral element method, have been gaining traction as they offer high performance on both multicore CPUs and modern GPU-based accelerators. In this work, we assess how high-fidelity CFD using the spectral element method can exploit the modular supercomputing architecture (MSA) at scale through domain partitioning, where the computational domain is split between a Booster module powered by GPUs and a Cluster module with conventional CPU nodes. We investigate several different flow cases and computer systems based on the MSA. We observe that, for our simulations, the communication overhead and load balancing issues incurred by incorporating different computing architectures are seldom worthwhile, especially when I/O is also considered. However, when the simulation at hand requires more than the combined global memory on the GPUs, utilizing additional CPUs to increase the available memory can be fruitful. We support our results with a simple performance model to assess when running across modules might be beneficial. As MSA is becoming more widespread and efforts to increase system utilization are growing more important, our results give insight into when and how a monolithic application can utilize and spread out to more than one module and obtain a faster time to solution.
Distributed, Parallel, and Cluster Computing; Mathematical Software; Fluid Dynamics
What problem does this paper attempt to address?
The main problem this paper addresses is how to run large-scale, high-fidelity computational fluid dynamics (CFD) simulations efficiently on the Modular Supercomputing Architecture (MSA), and in particular how to distribute the work across modules to reduce time to solution when a simulation exceeds the capacity of a single module.

### Specific problems include:

1. **Utilization of the Modular Supercomputing Architecture**:
   - How to distribute the computational domain across different modules (such as a GPU-based Booster module and a CPU-based Cluster module) through domain partitioning, so that the modules compute in parallel.
   - Which execution mode suits workloads of different sizes on a heterogeneous MSA system.
2. **Communication Overhead and Load Balancing**:
   - How communication overhead and load imbalance affect performance when different computing architectures (such as GPUs and CPUs) are combined.
   - When and how work can be split among multiple modules to reduce time to solution.
3. **Application of Performance Models**:
   - Use a simple performance model to estimate the potential gain from running on multiple modules.
   - Determine under which circumstances using multiple modules improves performance, especially when the problem is too large to fit in one module.
4. **Challenges in Practical Applications**:
   - Assess whether combining different modules is attractive in production runs, and give concrete examples of performance improvement.

### Core contributions of the paper:

- **Empirical Comparison**: Evaluate the performance of several flow cases under different GPU/CPU configurations, including the impact of I/O on load balancing.
- **Performance Model**: Develop a simple performance model for assessing the performance potential of running across multiple architectures.
- **Performance Improvement**: When the simulation does not fit on the GPU module, a performance improvement of up to 2.7 times is observed by using the GPU and CPU modules simultaneously.

### Conclusion:

Using empirical measurements and a performance model, the paper examines when high-fidelity CFD simulations benefit from spreading across modules of the modular supercomputing architecture. For the cases studied, the added communication overhead and load imbalance seldom pay off, but using CPU nodes to extend the available memory can be worthwhile when the simulation exceeds the combined GPU memory. The results serve as a reference for deciding how to allocate resources for CFD simulations on heterogeneous computing platforms.
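The trade-off summarized above (balanced compute time versus added communication, with GPU memory as a hard limit) can be illustrated with a small sketch. This is not the performance model from the paper, whose exact form is not given here. It is a minimal illustration under stated assumptions: each module is described by a per-device throughput and memory capacity measured in elements, the cross-module communication cost `t_comm` is a fixed time per step, and all names (`Module`, `cross_module_time`, `cross_module_pays_off`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Module:
    """Hypothetical description of one MSA module (all values are assumed inputs)."""
    n_devices: int        # number of GPUs (Booster) or CPU nodes (Cluster)
    throughput: float     # elements advanced per second, per device
    mem_elements: float   # elements that fit in memory, per device

    @property
    def rate(self) -> float:
        # Aggregate elements per second, assuming perfect scaling within the module.
        return self.n_devices * self.throughput

    @property
    def capacity(self) -> float:
        # Aggregate number of elements that fit in the module's memory.
        return self.n_devices * self.mem_elements


def booster_only_time(n_elements: float, booster: Module) -> float:
    """Time per step on the Booster alone; infinite if the case does not fit."""
    if n_elements > booster.capacity:
        return math.inf
    return n_elements / booster.rate


def cross_module_time(n_elements: float, booster: Module, cluster: Module,
                      t_comm: float) -> tuple[float, float]:
    """Return (fraction of elements on the Booster, time per step) for a split run."""
    if n_elements > booster.capacity + cluster.capacity:
        raise ValueError("case does not fit in the combined memory")
    # Load-balanced split: both modules finish their share at the same time.
    f = booster.rate / (booster.rate + cluster.rate)
    # Clamp the split so that neither module exceeds its memory capacity.
    f_max = booster.capacity / n_elements
    f_min = 1.0 - cluster.capacity / n_elements
    f = min(max(f, f_min), f_max)
    # The slower module sets the pace; the assumed communication cost is added on top.
    t_compute = max(f * n_elements / booster.rate,
                    (1.0 - f) * n_elements / cluster.rate)
    return f, t_compute + t_comm


def cross_module_pays_off(n_elements: float, booster: Module, cluster: Module,
                          t_comm: float) -> bool:
    """True if the split run is faster per step than the Booster-only run."""
    _, t_split = cross_module_time(n_elements, booster, cluster, t_comm)
    return t_split < booster_only_time(n_elements, booster)


if __name__ == "__main__":
    # Made-up numbers, chosen only to exercise the sketch.
    booster = Module(n_devices=4, throughput=100.0, mem_elements=1000.0)
    cluster = Module(n_devices=2, throughput=50.0, mem_elements=2000.0)
    for n, t_comm in [(2000.0, 0.5), (2000.0, 2.0), (6000.0, 0.5)]:
        print(n, t_comm, cross_module_pays_off(n, booster, cluster, t_comm))
```

In this sketch a case that fits on the Booster only benefits from the Cluster when the compute time saved exceeds `t_comm`, while a case that exceeds the Booster memory is forced into a memory-constrained, possibly unbalanced split. A realistic model would also need to account for I/O and for communication costs that depend on the partition interface, which this sketch deliberately leaves out.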