DNCs Require More Planning Steps

Yara Shamshoum, Nitzan Hodos, Yuval Sieradzki, Assaf Schuster
2024-06-04
Abstract: Many recent works use machine learning models to solve various complex algorithmic problems. However, these models attempt to reach a solution without considering the problem's required computational complexity, which can be detrimental to their ability to solve it correctly. In this work we investigate the effect of computational time and memory on generalization of implicit algorithmic solvers. To do so, we focus on the Differentiable Neural Computer (DNC), a general problem solver that also lets us reason directly about its usage of time and memory. In this work, we argue that the number of planning steps the model is allowed to take, which we call "planning budget", is a constraint that can cause the model to generalize poorly and hurt its ability to fully utilize its external memory. We evaluate our method on Graph Shortest Path, Convex Hull, Graph MinCut and Associative Recall, and show how the planning budget can drastically change the behavior of the learned algorithm, in terms of learned time complexity, training time, stability and generalization to inputs larger than those seen during training.
Machine Learning
What problem does this paper attempt to address?
The core problem this paper addresses is: **when a Differentiable Neural Computer (DNC) is used to solve complex algorithmic problems, how can the model's generalization be improved by adjusting the number of planning steps (the planning budget)?** Specifically, the authors study the impact of computation time and memory on the generalization of implicit algorithmic solvers, and propose that increasing the number of planning steps can significantly improve model performance.

### Main contributions of the paper

1. **Explore the impact of computational complexity on DNC**: The authors re-examine the DNC and similar models from the perspective of computational complexity, showing that choosing the right planning budget is crucial for generalization across different algorithmic tasks.
2. **Challenge the standard planning budget**: The results show that the traditional fixed planning budget (such as \(p(n) = 10\)) is limiting, and that choosing an appropriate planning budget can greatly improve model performance.
3. **Empirical evidence**: Experiments on several algorithmic problems (shortest path, minimum cut, convex hull, and associative recall) provide strong empirical evidence that the planning budget has a significant impact on the behavior of the learned algorithm.
4. **Solve the external memory expansion problem**: Identify the performance degradation that occurs when the DNC's external memory is expanded to support larger inputs, and propose a technique to overcome it.
5. **Improve training stability**: Propose a method that incorporates random planning budgets during training to encourage the learning of more abstract algorithms, thereby improving generalization.
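To make the difference between a fixed and an input-dependent planning budget concrete, the following is a minimal sketch, not the authors' implementation. It assumes an episode consists of an input phase of `n` steps, a planning phase of \(p(n)\) steps, and an answer phase. The names `phase_schedule`, `constant_budget`, `linear_budget`, and the `answer_len` parameter are hypothetical and introduced only for illustration.

```python
from typing import Callable, List

# A planning budget maps the input size n to a number of planning steps p(n).
Budget = Callable[[int], int]


def constant_budget(n: int) -> int:
    """Fixed budget p(n) = 10, independent of the input size."""
    return 10


def linear_budget(n: int) -> int:
    """Adaptive budget p(n) = n, growing with the input size."""
    return n


def phase_schedule(n: int, answer_len: int, budget: Budget) -> List[str]:
    """Label every time step of one episode (assumed layout: input, plan, answer)."""
    return ["input"] * n + ["plan"] * budget(n) + ["answer"] * answer_len


if __name__ == "__main__":
    for n in (5, 50):
        const = phase_schedule(n, 3, constant_budget)
        lin = phase_schedule(n, 3, linear_budget)
        print(f"n={n}: constant plan steps={const.count('plan')}, "
              f"linear plan steps={lin.count('plan')}")
```

Under the constant budget, the number of planning steps stays the same however large the input becomes, whereas the linear budget scales the available computation with \(n\).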
### Main methods

- **Adaptive planning budget**: Set the planning budget \(p(n)\) as a function of the input size \(n\), for example the linear function \(p(n) = n\).
- **External memory expansion and reweighting**: Address the performance degradation that occurs when the external memory is expanded by introducing a temperature recalibration parameter \(\tau\).
- **Experimental evaluation**: Run experiments on several algorithmic tasks to compare the effect of different planning budgets on the model's generalization and training efficiency.

### Conclusion

The paper shows experimentally that appropriately increasing the number of planning steps of a DNC can significantly improve its generalization when solving complex algorithmic problems. The proposed external memory expansion technique and training stability improvements also provide useful reference points for future research.
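The temperature recalibration parameter \(\tau\) mentioned under the main methods can be illustrated with a small sketch. It assumes that \(\tau\) divides the scores inside a softmax over memory slots, which is the usual meaning of a softmax temperature. The summary does not say exactly where the paper applies \(\tau\), so this is an assumption. The scores and the helper `peak_weight` are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float], tau: float = 1.0) -> List[float]:
    """Softmax with temperature tau; tau < 1 sharpens, tau > 1 flattens."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    scaled = [s / tau for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def peak_weight(memory_size: int, tau: float) -> float:
    """Weight on one matching slot (score 1.0) among non-matching slots (score 0.0).

    The scores are made up for illustration only.
    """
    scores = [1.0] + [0.0] * (memory_size - 1)
    return softmax(scores, tau)[0]


if __name__ == "__main__":
    for size, tau in ((50, 1.0), (200, 1.0), (200, 0.5)):
        print(f"memory={size}, tau={tau}: peak weight={peak_weight(size, tau):.4f}")
```

In this toy setting, enlarging the memory at a fixed temperature spreads the distribution over more slots and lowers the weight on the matching slot. Lowering \(\tau\) recovers some of that sharpness.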