Thomas Miconi, Jeff Clune, Kenneth O. Stanley
Abstract: How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. First, recurrent plastic networks with more than two million parameters can be trained to memorize and reconstruct sets of novel, high-dimensional (1000+ pixels) natural images not seen during training. Crucially, traditional non-plastic recurrent networks fail to solve this task. Furthermore, trained plastic networks can also solve generic meta-learning tasks such as the Omniglot task, with competitive results and little parameter overhead. Finally, in reinforcement learning settings, plastic networks outperform a non-plastic equivalent in a maze exploration task. We conclude that differentiable plasticity may provide a powerful novel approach to the learning-to-learn problem.
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve
The paper asks how to build agents that keep learning quickly and efficiently from experience after their initial training. This is the "learning to learn" (meta-learning) problem.
#### Background and Motivation
1. **Success of Machine Learning**: In recent years, machine learning has achieved significant success in learning single complex tasks from a large number of training samples. However, once learning is complete, the agent's knowledge becomes fixed; if it needs to be applied to different tasks, it requires retraining, which also demands a large number of new training samples.
2. **Advantages of Biological Intelligence**: In contrast, biological agents (such as animals) learn quickly and efficiently from ongoing experience. They learn to navigate, remember the location of food sources, and discover and remember the rewarding or aversive properties of new objects or situations, often after a single exposure.
3. **Importance of Lifelong Learning**: Endowing artificial agents with the ability to learn throughout their lifetime is crucial for mastering environments with changing or unpredictable characteristics, or specific features that cannot be foreseen during training. For example, supervised learning can allow neural networks to recognize letters in a specific fixed alphabet, but autonomous learning capabilities can enable agents to acquire knowledge of any alphabet, including those unknown to human designers during training.
4. **Meta-Learning Methods**: Several meta-learning methods have been proposed to train agents to learn autonomously. In biological brains, however, long-term learning is achieved primarily through synaptic plasticity: the connections between neurons are strengthened or weakened according to neural activity. Evolution has tuned this mechanism to support efficient learning within an individual's lifetime.
#### Research Objectives
1. **Differentiable Plasticity**: The paper proposes a method called "differentiable plasticity," which optimizes synaptic plasticity in large (millions of parameters) recurrent networks through gradient descent, enabling efficient lifelong learning.
2. **Experimental Validation**: The paper validates the effectiveness of this method through three different types of tasks:
- **Complex Pattern Memory**: Memorizing and reconstructing sets of novel, high-dimensional natural images not seen during training.
- **One-Shot Classification**: Performing one-shot classification tasks on the Omniglot dataset.
- **Reinforcement Learning**: Conducting reinforcement learning in maze exploration tasks.
3. **Performance Comparison**: Plastic networks perform well on all three tasks. In the complex pattern memory task, traditional non-plastic recurrent networks (such as LSTMs) fail to solve the task. On Omniglot, plastic networks achieve competitive results with little parameter overhead. In maze exploration, they outperform a non-plastic equivalent.
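
To make the idea of a Hebbian plastic connection concrete, here is a minimal forward-pass sketch in plain Python. The summary above does not spell out the update rule, so the sketch assumes one common formulation: each connection has a fixed weight `w`, a plasticity coefficient `alpha`, and a Hebbian trace `hebb` that is a decaying running average of the product of pre- and post-synaptic activity. The function name `plastic_step`, the `tanh` nonlinearity, and the trace rate `eta` are illustrative choices, not details taken from this page. Under this formulation, `w` and `alpha` are the quantities optimized by gradient descent across episodes, while `hebb` changes within an episode. The gradient step itself would normally be handled by an automatic differentiation framework and is not shown.

```python
import math


def plastic_step(x_prev, w, alpha, hebb, eta):
    """One time step of a recurrent layer with Hebbian plastic connections.

    x_prev : list of n activations from the previous time step
    w      : n x n fixed weights (assumed trained by gradient descent)
    alpha  : n x n plasticity coefficients (assumed trained the same way)
    hebb   : n x n Hebbian traces (updated within an episode)
    eta    : trace update rate, between 0 and 1 (illustrative parameter)
    """
    n = len(x_prev)
    # Effective weight of connection i -> j is the fixed part plus the
    # plastic part: w[i][j] + alpha[i][j] * hebb[i][j].
    x_new = [
        math.tanh(
            sum((w[i][j] + alpha[i][j] * hebb[i][j]) * x_prev[i] for i in range(n))
        )
        for j in range(n)
    ]
    # Hebbian trace: decaying running average of pre * post activity.
    hebb_new = [
        [(1.0 - eta) * hebb[i][j] + eta * x_prev[i] * x_new[j] for j in range(n)]
        for i in range(n)
    ]
    return x_new, hebb_new


# Usage: run a few steps of a tiny 3-neuron network.
n = 3
zeros = [[0.0] * n for _ in range(n)]
w = [[0.5 if i == j else 0.0 for j in range(n)] for i in range(n)]
alpha = [[1.0] * n for _ in range(n)]

x, hebb = [1.0, -1.0, 0.5], zeros  # traces start at zero in this sketch
for _ in range(3):
    x, hebb = plastic_step(x, w, alpha, hebb, eta=0.1)
```

Setting every entry of `alpha` to zero recovers an ordinary non-plastic recurrent layer, which is one way to see that plasticity adds learnable structure on top of the fixed weights rather than replacing them.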
### Summary
By introducing differentiable plasticity, the paper addresses how artificial agents can continue learning quickly and efficiently from experience after initial training. Across pattern memorization, one-shot classification on Omniglot, and maze exploration, plastic networks match or outperform non-plastic baselines, which suggests that differentiable plasticity is a promising approach to meta-learning.