Beyond gradients: Factorized, geometric control of interference and generalization

Daniel Nelson Scott, Michael J Frank
DOI: https://doi.org/10.1101/2021.11.19.466943
2024-09-23
Abstract: Interference and generalization, which refer to counter-productive and useful interactions between learning episodes, respectively, are poorly understood in biological neural networks. Whereas much previous work has addressed these topics in terms of specialized brain systems, here we investigated how learning rules should impact them. We found that plasticity between groups of neurons can be decomposed into biologically meaningful factors, with factor geometry controlling interference and generalization. We introduce a "coordinated eligibility theory" in which plasticity is determined according to products of these factors, and is subject to surprise-based metaplasticity. This model computes directional derivatives of loss functions, which need not align with task gradients, allowing it to protect networks against catastrophic interference and facilitate generalization. Because the model's factor structure is closely related to other plasticity rules, and is independent of how feedback is transmitted, it introduces a widely-applicable framework for interpreting supervised, reinforcement-based, and unsupervised plasticity in nervous systems.
Neuroscience
What problem does this paper attempt to address?
The paper addresses how interference and generalization arise in biological neural networks, focusing on the role of local plasticity rules. Previous studies have typically explained these phenomena in terms of specialized brain systems. This paper instead starts from learning rules: it shows that plasticity between groups of neurons can be decomposed into biologically meaningful factors, and that the geometry of these factors controls interference and generalization.

### Main Issues:

1. **Interference**: The counter-productive effect of new learning on existing memories, which degrades them or causes them to be forgotten.
2. **Generalization**: The useful effect of existing knowledge on new learning, which improves learning efficiency and adaptability.

### Research Methods:

- **Mathematical Decomposition**: The authors show that changes in network weights can be decomposed into changes in "receptive fields" (RF) and "population responses" (PR).
- **Coordinated Eligibility Theory**: The authors propose a "coordinated eligibility theory" in which plasticity is determined by products of these factors and is regulated by surprise-based metaplasticity.
- **Simulation Experiments**: Simulations of supervised learning tasks test the theory, showing that projecting population-response or receptive-field changes into different subspaces can avoid interference and promote generalization.

### Key Findings:

- **Gradient Decomposition**: Weight gradients can be decomposed into changes in population responses and receptive fields, and these factors determine the degree of interference and generalization.
- **Coordinated Plasticity**: By coordinating changes in population responses and receptive fields, the network can avoid interference and promote generalization in multi-task learning.
- **Geometric Paths**: Coordinated plasticity lets the network move along paths in weight space that need not follow the task gradient, so it is not restricted to gradient descent.

### Application Prospects:

- **Biological Explanation**: The theory is closely related to existing biological plasticity theories (such as the three-factor rule, BCM model, etc.), which supports its biological plausibility.
- **Broad Applicability**: The framework can be applied to supervised, reinforcement-based, and unsupervised learning, offering a common perspective for interpreting plasticity in nervous systems.

In summary, through mathematical modeling and simulation experiments, this paper examines the mechanisms of interference and generalization in biological neural networks and proposes a coordinated eligibility theory as a theoretical foundation for understanding and optimizing multi-task learning.
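The gradient decomposition and subspace projection described above can be illustrated with a minimal NumPy sketch. It assumes a single linear layer `y = W @ x` trained with squared error, which is a simplification chosen for illustration and not the authors' model; all variable names (`err`, `rf`, `rf_proj`, and so on) are hypothetical. Under that assumption the weight gradient is an outer product of an output-space factor (playing the role of a population-response change) and an input-space factor (playing the role of a receptive-field change). Projecting the input-space factor away from an old task's input removes interference with that task, while the update still has a negative directional derivative on the new task's loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 8, 5

# Hypothetical linear layer and two task inputs (illustrative values only).
W = rng.normal(size=(n_out, n_in))
x_old = rng.normal(size=n_in)        # input from a previously learned task
x_new = rng.normal(size=n_in)        # input from the task being learned now
target_new = rng.normal(size=n_out)  # desired output for x_new

# Squared-error gradient for y = W @ x factorizes as an outer product:
# dL/dW = (y - t) x^T, with an output-space and an input-space factor.
err = W @ x_new - target_new         # output-space factor
rf = x_new                           # input-space factor
grad = np.outer(err, rf)

lr = 0.1

# Plain gradient step: the old-task response changes by dW @ x_old,
# which equals -lr * err * (rf . x_old) and is generally nonzero.
dW_grad = -lr * grad
interference_grad = dW_grad @ x_old

# Project the input-space factor onto the orthogonal complement of x_old.
rf_proj = rf - (rf @ x_old) / (x_old @ x_old) * x_old
dW_proj = -lr * np.outer(err, rf_proj)
interference_proj = dW_proj @ x_old  # zero up to floating-point error

# Directional derivative of the new-task loss along the projected update:
# <grad, dW_proj> = -lr * |err|^2 * (rf . rf_proj), which is non-positive.
directional_derivative = np.sum(grad * dW_proj)

print("old-task change, gradient step: ", np.linalg.norm(interference_grad))
print("old-task change, projected step:", np.linalg.norm(interference_proj))
print("directional derivative on new task:", directional_derivative)
```

The projected update is not the task gradient, yet it still reduces the new-task loss to first order, which is the sense in which a factorized rule can follow a direction other than the gradient. In a deeper or nonlinear network the factors would depend on the architecture, so this sketch should be read only as a geometric illustration.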