Abstract: Optimizing data movements during program executions is essential for achieving high performance in modern computing systems. This has been classically modeled with the Red-Blue Pebble Game and its variants. In the existing models, it is typically assumed that the number of red pebbles, i.e., the size of the fast memory, is larger than the maximum in-degree in the computational graph (e.g. an arithmetic circuit). This assumption can be restrictive for many real applications, especially when dealing with "big data" in Machine Learning and Scientific Computing. In this work we study a generalization of the original Red-Blue Pebble Game to allow arbitrary in-degrees, which can be larger than the size of the fast memory. The objective is to minimize the I/O operations by allowing the computation of partial results in the fast memory. We show that this variant of the problem is NP-complete, even for the special case where the computational graph consists of a single level, and only two words fit in the fast memory. Approximation algorithms for a couple of special cases are also outlined.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses the problem of optimizing data movement during program execution in modern computing systems, especially when the size of fast memory (such as cache) is limited. Specifically, the paper focuses on:

1. **Extending the classic red-blue pebble game model**:
   - The traditional red-blue pebble game assumes that the number of red pebbles (i.e., the size of fast memory) is greater than the maximum in-degree of the computation graph (such as an arithmetic circuit). This assumption is too restrictive when dealing with large-scale data (e.g., "big data" in machine learning and scientific computing).
   - The paper proposes a new variant of the red-blue pebble game that allows arbitrarily large in-degrees, even in-degrees that exceed the size of fast memory. The goal is to minimize input/output (I/O) operations by allowing partial computations in fast memory.
2. **Proving the complexity of the problem**:
   - The author proves that the problem is NP-complete even in simplified cases (e.g., when the computation graph contains only a single layer and the fast memory can hold only two elements).
3. **Designing approximation algorithms**:
   - For some special cases, the author proposes approximation algorithms. For example, the problem is reduced to the traveling salesman problem (TSP) with a specific set of distances (such as {1, 2} or {1, 2, 3}), and methods such as the Christofides algorithm are used to find near-optimal solutions.

### Formulas and symbol explanations

- \(M\): The size of fast memory.
- \(B\): The size of each data block loaded (cache-line size).
- \(G=(V, E)\): A computational directed acyclic graph (DAG), where \(V\) is the set of nodes and \(E\) is the set of edges.
- \(T_M\): The set of all possible graph transformations such that the in-degree of every node does not exceed \(M\).
- \(OPT_T\): The cost of the optimal pebble game strategy for a given transformation \(T\in T_M\).
- \(T^* = \arg\min_{T\in T_M}OPT_T\): The transformation that minimizes the optimal cost.

### Conclusions

By extending the red-blue pebble game model, the paper removes the in-degree restriction of existing models, which matters when dealing with large-scale data, and establishes the complexity of the resulting problem. The author also designs approximation algorithms for some special cases, providing theoretical support and technical means for practical applications.
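To make the idea of partial computations in fast memory concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not the paper's formal pebbling rules: it assumes a single output node that computes an associative sum over inputs stored in slow memory, it ignores the block size \(B\), and the function name `reduce_with_partial_results` is hypothetical. One word of fast memory holds the partial result, and the remaining words are reused for successive batches of inputs, so the in-degree may exceed the fast memory size.

```python
def reduce_with_partial_results(inputs, fast_memory_size):
    """Sum `inputs` (held in slow memory) using at most `fast_memory_size` words.

    Assumption: one fast-memory word is reserved for the partial result and
    the remaining words hold inputs. Returns (result, loads, stores).
    """
    if fast_memory_size < 2:
        raise ValueError("need one word for the accumulator and one for an input")

    chunk = fast_memory_size - 1   # words left over for inputs
    accumulator = 0                # partial result, stays in fast memory
    loads = 0

    for start in range(0, len(inputs), chunk):
        block = inputs[start:start + chunk]   # load up to `chunk` inputs
        loads += len(block)
        for value in block:
            accumulator += value              # fold the input into the partial result
        # the input slots are free again and can be overwritten by the next batch

    stores = 1                                # write the finished result back once
    return accumulator, loads, stores


# In-degree 5 with room for only two words in fast memory.
result, loads, stores = reduce_with_partial_results([3, 1, 4, 1, 5], fast_memory_size=2)
print(result, loads, stores)  # 14 5 1
```

For a single node, this sketch needs one load per input and one store whenever the fast memory holds at least two words. It does not model inputs shared among several output nodes, where the order in which inputs are loaded can affect the total I/O count.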

I/O complexity and pebble games with partial computations
