Recommendations and specifications for data scope analysis tools

Cyril Mazauric, Erwan Raffin, David Guibert
DOI: https://doi.org/10.48550/arXiv.1908.06095
2019-08-16
Distributed, Parallel, and Cluster Computing
Abstract: This document is one of the deliverable reports created for the ESCAPE project. ESCAPE stands for Energy-efficient Scalable Algorithms for Weather Prediction at Exascale. The project develops world-class, extreme-scale computing capabilities for European operational numerical weather prediction and future climate models. This is done by identifying Weather & Climate dwarfs, which are key patterns in terms of computation and communication (in the spirit of the Berkeley dwarfs). These dwarfs are then optimised for different hardware architectures (single and multi-node) and alternative algorithms are explored. Performance portability is addressed through the use of domain-specific languages. In today's computer architectures, moving data is considerably more time- and energy-consuming than computing on this data. One of the key performance optimizations for any application is therefore to minimize data motion and maximize data reuse. On modern supercomputers with very complex and deep memory hierarchies in particular, it is mandatory to take data locality into account. When targeting accelerators with directive systems like OpenACC or OpenMP, identifying data scope, access type and data reuse is critical to minimize the data transfers to and from the accelerator. Unfortunately, manually identifying data locality information in complex code bases can be a time-consuming task, and tool support is therefore desirable. In this report we summarize the results of a survey of currently available tools that support software developers and performance engineers with data locality information in complex code bases like numerical weather prediction (NWP) or climate simulation applications. Based on the survey results, we then recommend a tool and specify some extensions for a tool to solve the problems encountered in an NWP application.