Abstract: Various static analysis problems are reformulated as instances of the Context-Free Language Reachability (CFL-r) problem. One promising way to make solving CFL-r more practical for large-scale interprocedural graphs is to reduce CFL-r to linear algebra operations on sparse matrices, as they are efficiently executed on modern hardware. In this work, we present five optimizations for a matrix-based CFL-r algorithm that utilize the specific properties of both the underlying semiring and the widely-used linear algebra library SuiteSparse:GraphBLAS. Our experimental results show that these optimizations result in orders of magnitude speedup, with the optimized matrix-based CFL-r algorithm consistently outperforming state-of-the-art CFL-r solvers across four considered static analyses.

What problem does this paper attempt to address?

The paper aims to make solving the Context-Free Language Reachability (CFL-r) problem more efficient, especially on large-scale interprocedural graphs. CFL-r is a core problem in static analysis: given a labeled graph, find the paths whose label sequences belong to a context-free language (CFL). Many static analysis tasks can be formulated as CFL-r problems, such as alias analysis, pointer analysis, value-flow analysis, and fixing compilation errors.

The authors start from a matrix-based CFL-r algorithm, which exploits the efficiency of sparse matrix operations on modern hardware, and propose five optimizations that target its bottlenecks in matrix multiplication and element-wise union. With these optimizations they report orders-of-magnitude speedups, and the optimized matrix-based CFL-r algorithm consistently outperforms existing state-of-the-art CFL-r solvers on the four static analyses considered.

### Overview of Optimization Measures

1. **Matrix Multiplication Optimization**: Replace the original matrix multiplication \( M \cdot_{R_{Gr}} M \) with \( (M_{old} \cdot_{R_{Gr}} \Delta M) \cup (\Delta M \cdot_{R_{Gr}} M) \), where \( \Delta M = M \setminus M_{old} \) is the element-wise set difference. This avoids recomputing products that were already computed in earlier iterations.
2. **Sparse Matrix Format Optimization**: Maintain two copies of the matrix \( M \), one stored in row-major format and one in column-major format. For each multiplication, select the format that suits the sparsity of the operands to improve computational efficiency.
3. **Matrix Storage Optimization**: Instead of storing \( M \) as a single matrix, decompose it into multiple sub-matrices \( \widetilde{M} = \{M_1, M_2, \ldots, M_p\} \) and merge these sub-matrices according to specific rules to reduce the overhead of rebuilding the matrix in memory.
4. **CFG Production Rule Optimization**: For context-free grammars (CFGs) with a large number of production rules, use "indexed" non-terminals to reduce the number of Boolean matrix multiplications. For example, all rules of the form \( A_{R_i} \to A_{ret_i} \) are counted as only one rule.
5. **CFG Transformation Optimization**: For field-sensitive pointer analysis of Java and field-insensitive alias analysis of C/C++, manually transform the CFGs to Weak Chomsky Normal Form (WCNF) to improve performance.

### Experimental Results

The experimental results show that the optimized matrix-based CFL-r algorithm significantly outperforms existing tools such as POCR, Graspan, and Gigascale on multiple benchmarks, both in running time and in memory usage when processing large-scale graphs.

### Conclusions and Future Work

With these optimizations, the authors demonstrate the superior performance of the optimized matrix-based CFL-r algorithm on a range of problems. Future work includes complexity analysis and the generalization of these optimizations to other algorithms.
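To make the first optimization in the overview more concrete, here is a minimal sketch of the incremental idea, not the paper's implementation. It assumes Boolean matrices stored as plain Python sets of `(row, col)` pairs instead of SuiteSparse:GraphBLAS matrices, one matrix per non-terminal, and a toy grammar for the language a^n b^n. All function names, the grammar, and the graph are hypothetical and chosen only for illustration. The incremental version multiplies only by the newly found entries (the delta) and is checked against a naive closure that recomputes every product in every iteration.

```python
from collections import defaultdict

def mul(x, y):
    """Boolean product of two sparse matrices given as sets of (row, col) pairs."""
    rows_of_y = defaultdict(set)
    for k, j in y:
        rows_of_y[k].add(j)
    return {(i, j) for i, k in x for j in rows_of_y.get(k, ())}

def init_matrices(edges, terminal_rules, nonterminals):
    """Build one Boolean matrix per non-terminal from rules N -> terminal."""
    m = {n: set() for n in nonterminals}
    for u, label, v in edges:
        for n, t in terminal_rules.items():
            if t == label:
                m[n].add((u, v))
    return m

def naive_closure(m, binary_rules):
    """Recompute the full product for every rule A -> B C until nothing changes."""
    m = {n: set(p) for n, p in m.items()}
    changed = True
    while changed:
        changed = False
        for a, b, c in binary_rules:
            new = mul(m[b], m[c]) - m[a]
            if new:
                m[a] |= new
                changed = True
    return m

def incremental_closure(m, binary_rules):
    """Multiply only by newly found entries: (old * delta) | (delta * full)."""
    old = {n: set() for n in m}                  # entries already multiplied
    delta = {n: set(p) for n, p in m.items()}    # entries found last round
    while any(delta.values()):
        full = {n: old[n] | delta[n] for n in m}
        produced = {n: set() for n in m}
        for a, b, c in binary_rules:
            produced[a] |= mul(old[b], delta[c]) | mul(delta[b], full[c])
        old = full
        delta = {n: produced[n] - full[n] for n in m}  # element-wise difference
    return old

# Toy grammar for a^n b^n (n >= 1) in a binary normal form (illustrative only):
#   S -> A B | A S1,  S1 -> S B,  A -> a,  B -> b
terminal_rules = {"A": "a", "B": "b"}
binary_rules = [("S", "A", "B"), ("S", "A", "S1"), ("S1", "S", "B")]
nonterminals = ["S", "S1", "A", "B"]

# Toy graph: 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4
edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]

start = init_matrices(edges, terminal_rules, nonterminals)
naive = naive_closure(start, binary_rules)
incremental = incremental_closure(start, binary_rules)
```

Both functions reach the same fixed point, but the incremental one never multiplies two "old" matrices again, because those products were already accounted for in an earlier round. In a real sparse linear algebra library the same split would be expressed with matrix multiplication, element-wise union, and a masked or set-difference operation rather than Python sets.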

Optimization of the Context-Free Language Reachability Matrix-Based Algorithm
