Abstract: A variety of code analyzers, such as IACA, uiCA, llvm-mca or Ithemal, strive to statically predict the throughput of a computation kernel. Each analyzer is based on its own simplified CPU model reasoning at the scale of a basic block. Facing this diversity, evaluating their strengths and weaknesses is important to guide both their usage and their enhancement. We present CesASMe, a fully-tooled solution to evaluate code analyzers on C-level benchmarks composed of a benchmark derivation procedure that feeds an evaluation harness. We conclude that memory-carried data dependencies are a major source of imprecision for these tools. We tackle this issue with staticdeps, a static analyzer extracting memory-carried data dependencies, including across loop iterations, from an assembly basic block. We integrate its output to uiCA, a state-of-the-art code analyzer, to evaluate staticdeps' impact on a code analyzer's precision through CesASMe.

### What problem does this paper attempt to solve?

This paper addresses the accuracy problems that code analyzers encounter when statically predicting the throughput of computation kernels, especially those caused by memory-carried dependencies. Specifically:

1. **Diversity and limitations of code analyzers**:
   - Many code analyzers (such as IACA, uiCA, llvm-mca, and Ithemal) attempt to statically predict the throughput of computation kernels, but each is based on its own simplified CPU model that reasons at the scale of a basic block.
   - Evaluating the strengths and weaknesses of these tools is important for guiding both their use and their improvement.
2. **Challenges of memory-carried dependencies**:
   - Memory-carried data dependencies are a major source of imprecision for these tools. Such dependencies are difficult to model, especially when they span loop iterations.
   - Existing code analyzers struggle to account for memory dependencies, which leads to inaccurate predictions.
3. **Proposed solutions**:
   - The authors present **CesASMe**, a tool for evaluating code analyzers on C-level benchmarks. CesASMe includes a benchmark derivation procedure that generates micro-benchmarks and feeds them to an evaluation harness.
   - To address memory-carried dependencies, the authors also develop a static analyzer named **staticdeps**, which extracts memory-carried data dependencies from an assembly basic block, including dependencies across loop iterations.
   - The output of staticdeps is integrated into uiCA, and CesASMe is used to evaluate its impact on the analyzer's precision.

Through these methods, the authors aim to improve the accuracy of code analyzers in the presence of memory dependencies and to provide more reliable performance predictions.
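
To make the notion of a memory-carried dependency across loop iterations concrete, the following is a minimal Python sketch. It is not the staticdeps algorithm: the `Access` type, the `carried_dependencies` function, and the assumption that every access in the loop body has the form `base + stride * i + offset` with a positive stride are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Hypothetical model of one memory access in a loop body."""
    kind: str    # "load" or "store"
    base: str    # symbolic base address, e.g. an array name
    offset: int  # constant byte offset added to base + stride * i


def carried_dependencies(body, stride, max_distance=4):
    """Return (store_index, load_index, distance) triples.

    A triple means: the store at position store_index in iteration i writes
    the address that the load at position load_index reads in iteration
    i + distance. Assumes stride > 0 and a common stride for all accesses.
    """
    deps = []
    for s_idx, store in enumerate(body):
        if store.kind != "store":
            continue
        for l_idx, load in enumerate(body):
            if load.kind != "load" or load.base != store.base:
                continue
            # Addresses match when stride * distance == store.offset - load.offset.
            delta = store.offset - load.offset
            if delta % stride != 0:
                continue
            distance = delta // stride
            same_iteration = (distance == 0 and s_idx < l_idx)
            later_iteration = (0 < distance <= max_distance)
            if same_iteration or later_iteration:
                deps.append((s_idx, l_idx, distance))
    return deps


# Loop body of `a[i + 1] = a[i] + 1` over 8-byte elements:
# the load reads a[i], the store writes a[i + 1].
body = [Access("load", "a", 0), Access("store", "a", 8)]
deps = carried_dependencies(body, stride=8)
```

Here the store of one iteration feeds the load of the next, so every iteration must wait for the previous one. An analyzer that ignores this dependency would treat the iterations as independent and could predict a throughput that is too optimistic.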
### Formulas involved

- **Relative error formula**:
  \[ \text{err} = \left| \frac{C_{\text{pred}} - C_{\text{baseline}}}{C_{\text{baseline}}} \right| \]
  where \( C_{\text{pred}} \) is the predicted number of cycles and \( C_{\text{baseline}} \) is the number of cycles measured for the baseline.
- **Lifted prediction formula**:
  \[ \text{lifted\_pred}(K) = \sum_{b \in \text{BBs}(K)} \text{occurrences}(b) \times \text{pred}(b) \]
  where \( K \) is the kernel, \( \text{BBs}(K) \) is the set of basic blocks in the kernel, \( \text{occurrences}(b) \) is the number of times the basic block \( b \) appears, and \( \text{pred}(b) \) is the predicted throughput of the basic block \( b \).

With these formulas, the authors can quantify and compare the prediction accuracy of different code analyzers.
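
The two formulas above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the dictionary-based representation of basic blocks, and the numbers in the example are assumptions, not part of CesASMe.

```python
def relative_error(predicted_cycles, baseline_cycles):
    """err = |(C_pred - C_baseline) / C_baseline|."""
    return abs((predicted_cycles - baseline_cycles) / baseline_cycles)


def lifted_prediction(occurrences, predictions):
    """Sum over basic blocks of occurrences(b) * pred(b).

    occurrences: maps a basic-block identifier to how often it appears.
    predictions: maps the same identifiers to the analyzer's prediction.
    """
    return sum(count * predictions[bb] for bb, count in occurrences.items())


# Made-up kernel with two basic blocks (values chosen for illustration).
occurrences = {"bb0": 100, "bb1": 10}
predictions = {"bb0": 2.0, "bb1": 5.5}

lifted = lifted_prediction(occurrences, predictions)  # 100 * 2.0 + 10 * 5.5
error = relative_error(lifted, baseline_cycles=300.0)
```

In this made-up example the lifted prediction is 255 cycles against a baseline of 300 cycles, which gives a relative error of 0.15.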

CesASMe and Staticdeps: static detection of memory-carried dependencies for code analyzers

Self-adaptive static analysis

Targeted Static Analysis for OCaml C Stubs: eliminating gremlins from the code

Static Analysis for Fast and Accurate Design Space Exploration of Caches

SCAD: Controlled Memory Allocation Analysis and Detection

VBSAC: a value-based static analyzer for C

Scalable and Extensible Static Memory Safety Analysis with Summary over Access Path

Static Reuse Profile Estimation for Array Applications

Integrating Static Code Analysis Toolchains

Improving Memory Dependence Prediction with Static Analysis

LLVM Static Analysis for Program Characterization and Memory Reuse Profile Estimation

Finding and Understanding Defects in Static Analyzers by Constructing Automated Oracles

Cmad: A C Memory Access Errors Detector Based on Semantics Abstraction

PhASAR: An Inter-procedural Static Analysis Framework for C/C++

A Source-Level Instrumentation Framework for the Dynamic Analysis of Memory Safety

Examem: Low-Overhead Memory Instrumentation for Intelligent Memory Systems

Applied static analysis and specialization of cross-core syscalls for multi-core AUTOSAR OS

Implementing and Executing Static Analysis Using LLVM and CodeChecker

Heaps Don't Lie: Countering Unsoundness with Heap Snapshots

Static Dependency Analysis For Concurrent Ada 95 Programs

Deep Static Modeling of invokedynamic