Beyond gradients: Factorized, geometric control of interference and generalization

Daniel Nelson Scott, Michael J Frank

DOI: https://doi.org/10.1101/2021.11.19.466943

2024-09-23

Abstract: Interference and generalization, which refer to counter-productive and useful interactions between learning episodes, respectively, are poorly understood in biological neural networks. Whereas much previous work has addressed these topics in terms of specialized brain systems, here we investigated how learning rules should impact them. We found that plasticity between groups of neurons can be decomposed into biologically meaningful factors, with factor geometry controlling interference and generalization. We introduce a "coordinated eligibility theory" in which plasticity is determined according to products of these factors, and is subject to surprise-based metaplasticity. This model computes directional derivatives of loss functions, which need not align with task gradients, allowing it to protect networks against catastrophic interference and facilitate generalization. Because the model's factor structure is closely related to other plasticity rules, and is independent of how feedback is transmitted, it introduces a widely-applicable framework for interpreting supervised, reinforcement-based, and unsupervised plasticity in nervous systems.

Neuroscience

What problem does this paper attempt to address?

The paper addresses how interference and generalization arise in biological neural networks, focusing on how local plasticity rules affect them. Previous studies have typically explained these phenomena in terms of specialized brain systems. This paper instead starts from learning rules: it investigates how plasticity between neuron populations can be decomposed into biologically meaningful factors, and how the geometry of those factors controls interference and generalization.

### Main Issues:

1. **Interference**: The negative impact of new learning on old memories, leading to their degradation or loss.
2. **Generalization**: The ability of new learning to draw on existing knowledge, improving learning efficiency and adaptability.

### Research Methods:

- **Mathematical Decomposition**: The authors found that changes in neural network weights can be decomposed into changes in "receptive fields" (RF) and "population responses" (PR).
- **Coordinated Eligibility Theory**: The authors propose a "coordinated eligibility theory" in which plasticity is determined by products of these factors and is regulated by surprise-based metaplasticity.
- **Simulation Experiments**: A series of simulations on supervised learning tasks tested the theory, demonstrating how projecting population response or receptive field changes into different subspaces can avoid interference and promote generalization.

### Key Findings:

- **Gradient Decomposition**: Weight gradients can be decomposed into changes in population responses and receptive fields, which determine the degree of interference and generalization.
- **Coordinated Plasticity**: By coordinating changes in population responses and receptive fields, the network can avoid interference and promote generalization in multi-task learning.
- **Geometric Paths**: The model computes directional derivatives of the loss, so weight updates need not align with the task gradient. This frees the network from the constraints of pure gradient descent.

### Application Prospects:

- **Biological Explanation**: The theory's factor structure is closely related to existing plasticity rules (such as three-factor rules and the BCM model), which supports its biological plausibility.
- **Broad Applicability**: Because the framework is independent of how feedback is transmitted, it can be applied to supervised, reinforcement-based, and unsupervised learning, offering a new perspective on plasticity in nervous systems.

In summary, through mathematical modeling and simulation, this paper examines the mechanisms of interference and generalization in biological neural networks and proposes a coordinated eligibility theory, providing a theoretical foundation for understanding multi-task learning.
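The subspace-projection idea summarized above can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a single linear layer `y = W x` with squared-error loss, where the gradient is the outer product of an output-side factor (the error) and an input-side factor (the input). Treating these two factors as stand-ins for the paper's population-response and receptive-field factors is an assumption made for illustration, and all function and variable names are hypothetical. Projecting the input-side factor away from a previously learned input leaves the old response untouched while still lowering the new task's loss, although the resulting step no longer points along the task gradient.

```python
# Toy illustration only: one linear layer y = W x with squared-error loss.
# All names are hypothetical and not taken from the paper.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def outer(u, v):
    """Rank-1 matrix u v^T."""
    return [[ui * vj for vj in v] for ui in u]

def project_out(v, direction):
    """Remove from v its component along `direction`."""
    scale = dot(v, direction) / dot(direction, direction)
    return [vi - scale * di for vi, di in zip(v, direction)]

def loss(W, x, target):
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(matvec(W, x), target))

def update(W, output_factor, input_factor, lr):
    """Apply the factorized step W <- W - lr * outer(output_factor, input_factor)."""
    dW = outer(output_factor, input_factor)
    return [[w - lr * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, dW)]

W = [[0.5, -0.2, 0.1],
     [0.3, 0.4, -0.6]]
x_old = [1.0, 0.0, 1.0]   # input from a previously learned task
x_new = [1.0, 1.0, 0.0]   # input from the new task (overlaps with x_old)
t_new = [1.0, -1.0]       # target for the new task

# Output-side factor: the error on the new task.
error = [yi - ti for yi, ti in zip(matvec(W, x_new), t_new)]

# Plain gradient step: the input-side factor is x_new itself.
W_grad = update(W, error, x_new, lr=0.1)

# Projected step: the input-side factor is made orthogonal to x_old.
W_proj = update(W, error, project_out(x_new, x_old), lr=0.1)

y_old_before = matvec(W, x_old)
y_old_grad = matvec(W_grad, x_old)   # shifted by the gradient step
y_old_proj = matvec(W_proj, x_old)   # unchanged up to rounding

print("old response before:        ", y_old_before)
print("old response, gradient step:", y_old_grad)
print("old response, projected:    ", y_old_proj)
print("new-task loss before / gradient / projected:",
      loss(W, x_new, t_new), loss(W_grad, x_new, t_new), loss(W_proj, x_new, t_new))
```

In this toy setting the projected step trades some progress on the new task for zero change in the old response. The step still has a negative directional derivative of the new-task loss, because the projected input retains a nonzero component of `x_new`.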

Evolving interpretable plasticity for spiking networks

Evolving Decomposed Plasticity Rules for Information-Bottlenecked Meta-Learning

Synergies Between Intrinsic and Synaptic Plasticity Based on Information Theoretic Learning

Balancing complexity, performance and plausibility to meta learn plasticity rules in recurrent spiking networks

Differentiable plasticity: training plastic neural networks with backpropagation

Critical neural networks with short and long term plasticity

Neuroplastic Expansion in Deep Reinforcement Learning

Discovering plasticity rules that organize and maintain neural circuits

Using local plasticity rules to train recurrent neural networks

Model Based Inference of Synaptic Plasticity Rules

Beyond spiking networks: The computational advantages of dendritic amplification and input segregation

Learning the Plasticity: Plasticity-Driven Learning Framework in Spiking Neural Networks

Evolving Generalized Modulatory Learning: Unifying Neuromodulation and Synaptic Plasticity

Toward the Emergence of Intelligent Control: Episodic Generalization and Optimization

Understanding plasticity in neural networks

Hebbian and Gradient-based Plasticity Enables Robust Memory and Rapid Learning in RNNs

One Step Back, Two Steps Forward: Interference and Learning in Recurrent Neural Networks

Neuron-centric Hebbian Learning

Learning what matters: Synaptic plasticity with invariance to second-order input correlations

Disentangling the Causes of Plasticity Loss in Neural Networks