Disentangling the Causes of Plasticity Loss in Neural Networks

Clare Lyle, Zeyu Zheng, Khimya Khetarpal, Hado van Hasselt, Razvan Pascanu, James Martens, Will Dabney

2024-02-29

Abstract: Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a stationary data distribution. In settings where this assumption is violated, e.g. deep reinforcement learning, learning algorithms become unstable and brittle with respect to hyperparameters and even random seeds. One factor driving this instability is the loss of plasticity, meaning that updating the network's predictions in response to new information becomes more difficult as training progresses. While many recent works provide analyses and partial solutions to this phenomenon, a fundamental question remains unanswered: to what extent do known mechanisms of plasticity loss overlap, and how can mitigation strategies be combined to best maintain the trainability of a network? This paper addresses these questions, showing that loss of plasticity can be decomposed into multiple independent mechanisms and that, while intervening on any single mechanism is insufficient to avoid the loss of plasticity in all cases, intervening on multiple mechanisms in conjunction results in highly robust learning algorithms. We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks, and further demonstrate its effectiveness on naturally arising nonstationarities, including reinforcement learning in the Arcade Learning Environment.
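As a rough illustration of the two ingredients the abstract names, the sketch below applies layer normalization to the hidden pre-activations of a small MLP and adds L2 weight decay to a plain SGD update. The abstract does not specify an architecture, optimizer, or hyperparameters, so the network shape, the function names (`layer_norm`, `forward`, `sgd_step`), the choice to decay weight matrices but not biases, and all numeric values are assumptions made for this example only.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize each row of pre-activations to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def forward(params, x):
    """Two-layer MLP: linear -> layer norm -> ReLU -> linear."""
    h = layer_norm(x @ params["w1"] + params["b1"])
    return np.maximum(h, 0.0) @ params["w2"] + params["b2"]


def sgd_step(params, grads, lr=1e-2, weight_decay=1e-4):
    """One SGD step with L2 weight decay.

    Assumption: decay is applied to weight matrices only, not to biases.
    """
    new_params = {}
    for name, p in params.items():
        g = grads[name]
        if name.startswith("w"):
            g = g + weight_decay * p  # pulls weights toward zero
        new_params[name] = p - lr * g
    return new_params


rng = np.random.default_rng(0)
params = {
    "w1": rng.normal(size=(4, 16)), "b1": np.zeros(16),
    "w2": rng.normal(size=(16, 1)), "b2": np.zeros(1),
}
x = rng.normal(size=(8, 4))
out = forward(params, x)

# With zero task gradients, the update reduces to pure weight shrinkage:
# w <- w * (1 - lr * weight_decay).
zero_grads = {k: np.zeros_like(v) for k, v in params.items()}
decayed = sgd_step(params, zero_grads, lr=0.1, weight_decay=0.5)
```

In this sketch, layer normalization keeps the scale of the hidden pre-activations fixed regardless of how the first-layer weights drift, while weight decay counteracts growth in the weight norm. How these two effects map onto the specific mechanisms studied in the paper is not stated in the abstract.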

Machine Learning

What problem does this paper attempt to address?

The paper addresses plasticity loss in neural networks trained on non-stationary data distributions. A network that has lost plasticity struggles to update its predictions in response to new information; in deep reinforcement learning in particular, this makes learning algorithms unstable and sensitive to hyperparameters and random seeds. The authors show that plasticity loss can be decomposed into multiple independent mechanisms, and that intervening on several of them together yields much more robust learning than intervening on any one alone. In particular, a combination of layer normalization and weight decay maintains plasticity across a variety of synthetic non-stationary learning tasks and in naturally arising non-stationary settings such as reinforcement learning in the Arcade Learning Environment. The study also analyzes which types of non-stationarity cause plasticity loss, how network parameters and feature structure change during training, and which properties are shared among networks that lose plasticity. It identifies known mechanisms such as dead units as well as new ones such as unit linearization, and highlights the effect of target value magnitude on plasticity loss in regression tasks. Overall, the paper aims to establish a model of plasticity loss that can guide the development of more effective methods for maintaining plasticity and avoiding optimization instabilities in non-stationary learning problems.
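The two unit-level mechanisms mentioned above can be made concrete with a small diagnostic. The summary does not define them precisely, so this sketch assumes a ReLU layer and uses simplified batch-level definitions: a unit counts as "dead" if its pre-activation is never positive on the batch (so it outputs zero everywhere), and as "linearized" if its pre-activation is always positive (so the ReLU acts as the identity). The function name `unit_diagnostics` and the toy data are hypothetical.

```python
import numpy as np


def unit_diagnostics(pre_activations):
    """Summarize ReLU unit status over a batch.

    pre_activations: array of shape (batch, units) holding the inputs
    to a ReLU nonlinearity.
    Returns the fraction of dead units (never positive on the batch) and
    the fraction of linearized units (always positive on the batch).
    """
    positive = pre_activations > 0
    dead = ~positive.any(axis=0)
    linearized = positive.all(axis=0)
    return {
        "dead_fraction": float(dead.mean()),
        "linearized_fraction": float(linearized.mean()),
    }


# Toy batch of 3 inputs and 4 units (illustrative values only).
pre = np.array([
    [-1.0, 2.0, 0.5, -0.3],
    [-2.0, 1.0, -0.5, -0.1],
    [-0.5, 3.0, 1.5, -0.7],
])
stats = unit_diagnostics(pre)
```

Here units 0 and 3 are dead, unit 1 is linearized, and unit 2 is active on some inputs only. Tracking both fractions over training is one simple way to see whether a layer is losing its nonlinearity from either direction.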
