Abstract: Many recent works use machine learning models to solve various complex algorithmic problems. However, these models attempt to reach a solution without considering the problem's required computational complexity, which can be detrimental to their ability to solve it correctly. In this work we investigate the effect of computational time and memory on generalization of implicit algorithmic solvers. To do so, we focus on the Differentiable Neural Computer (DNC), a general problem solver that also lets us reason directly about its usage of time and memory. We argue that the number of planning steps the model is allowed to take, which we call "planning budget", is a constraint that can cause the model to generalize poorly and hurt its ability to fully utilize its external memory. We evaluate our method on Graph Shortest Path, Convex Hull, Graph MinCut and Associative Recall, and show how the planning budget can drastically change the behavior of the learned algorithm, in terms of learned time complexity, training time, stability and generalization to inputs larger than those seen during training.

What problem does this paper attempt to address?

The core question this paper addresses is: **when the Differentiable Neural Computer (DNC) is used to solve complex algorithmic problems, how does the number of planning steps (the planning budget) affect the model's ability to generalize, and how should it be chosen?** Specifically, the authors study the impact of computation time and memory on the generalization of implicit algorithmic solvers, and argue that a planning budget that is too small can cause the model to generalize poorly and underuse its external memory.

### Main contributions of the paper

1. **Explore the impact of computational complexity on the DNC**: The authors re-examine the DNC and similar models from the perspective of computational complexity, showing that choosing the right planning budget is crucial for generalization across different algorithmic tasks.
2. **Challenge the standard planning budget**: The results show that the traditional fixed planning budget (such as \(p(n) = 10\)) is limiting, and that choosing an appropriate planning budget can greatly improve model performance.
3. **Empirical evidence**: Experiments on multiple algorithmic problems (shortest path, minimum cut, convex hull, and associative recall) provide strong empirical evidence that the planning budget has a significant impact on the behavior of the learned algorithm.
4. **Address the external memory expansion problem**: The authors identify the performance degradation that occurs when the DNC's external memory is expanded to support larger inputs, and propose a technique to overcome it.
5. **Improve training stability**: The authors propose a method that incorporates random planning budgets to encourage the learning of more abstract algorithms, thereby improving generalization.

### Main methods

- **Adaptive planning budget**: Set the planning budget \(p(n)\) as a function of the input size \(n\), for example the linear function \(p(n) = n\) or another functional form.
- **External memory expansion and re-weighting**: Counter the performance degradation that occurs when expanding the external memory by introducing a temperature recalibration parameter \(\tau\).
- **Experimental evaluation**: Run experiments on multiple algorithmic tasks to compare the effect of different planning budgets on the model's generalization and training efficiency.

### Conclusion

The paper shows through experiments that appropriately increasing the number of planning steps of the DNC can significantly improve its generalization when solving complex algorithmic problems. In addition, the proposed external memory expansion technique and the training stability method provide useful reference points for future research.
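To make the two ideas under "Main methods" concrete, here is a minimal Python sketch. It is illustrative only and is not the authors' code. The function names, the three phase labels (`input`, `plan`, `answer`), and the number of answer steps are hypothetical. It also assumes that \(\tau\) acts as a softmax temperature on memory-addressing scores, which the summary does not spell out.

```python
import math
from typing import Callable, List


def constant_budget(n: int) -> int:
    """Fixed planning budget p(n) = 10, independent of the input size n."""
    return 10


def linear_budget(n: int) -> int:
    """Adaptive planning budget p(n) = n, growing with the input size n."""
    return n


def phase_schedule(n: int, budget: Callable[[int], int],
                   answer_steps: int) -> List[str]:
    """Label each time step of one episode (hypothetical layout):
    n input steps, then p(n) planning steps, then the answer steps."""
    return ["input"] * n + ["plan"] * budget(n) + ["answer"] * answer_steps


def softmax_with_temperature(scores: List[float], tau: float) -> List[float]:
    """Softmax over scores / tau (assumed role of tau).
    tau < 1 sharpens the distribution, tau > 1 flattens it."""
    scaled = [s / tau for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Under these assumptions, an input of size 20 gets 10 planning steps from `constant_budget` and 20 from `linear_budget`, so only the adaptive schedule lets the available computation grow with the problem. Likewise, calling `softmax_with_temperature` with `tau` below 1 concentrates weight on the highest-scoring memory slot, which is one plausible way to counteract weights being spread thinly over a larger memory.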

DNCs Require More Planning Steps

Exploiting Problem Structure in Deep Declarative Networks: Two Case Studies

Shallow-Deep Networks: Understanding and Mitigating Network Overthinking

Optimizing DNN computation graph using graph substitutions

Estimating Minimum Operation Steps Via Memory-based Recurrent Calculation Network

Hybrid computing using a neural network with dynamic external memory

Computational Issues in Time-Inconsistent Planning

Training Overparametrized Neural Networks in Sublinear Time

Predicting the Computational Cost of Deep Learning Models

Quality and Cost of Deterministic Network Calculus

What to Do When Your Discrete Optimization Is the Size of a Neural Network?

Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory

Optimal DNN Primitive Selection with Partitioned Boolean Quadratic Programming

DSA: More Efficient Budgeted Pruning via Differentiable Sparsity Allocation

Bottleneck Analysis of Dynamic Graph Neural Network Inference on CPU and GPU

Neural-Guided RuntimePrediction of Planners for Improved Motion and Task Planning with Graph Neural Networks

Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

Neural Networks for Predicting Algorithm Runtime Distributions

Decision-Focused Learning to Predict Action Costs for Planning

Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints