Meta-Learning Neural Procedural Biases

Christian Raymond, Qi Chen, Bing Xue, Mengjie Zhang
2024-06-12
Abstract: The goal of few-shot learning is to generalize and achieve high performance on new unseen learning tasks, where each task has only a limited number of examples available. Gradient-based meta-learning attempts to address this challenging task by learning how to learn new tasks by embedding inductive biases informed by prior learning experiences into the components of the learning algorithm. In this work, we build upon prior research and propose Neural Procedural Bias Meta-Learning (NPBML), a novel framework designed to meta-learn task-adaptive procedural biases. Our approach aims to consolidate recent advancements in meta-learned initializations, optimizers, and loss functions by learning them simultaneously and making them adapt to each individual task to maximize the strength of the learned inductive biases. This imbues each learning task with a unique set of procedural biases which is specifically designed and selected to attain strong learning performance in only a few gradient steps. The experimental results show that by meta-learning the procedural biases of a neural network, we can induce strong inductive biases towards a distribution of learning tasks, enabling robust learning performance across many well-established few-shot learning benchmarks.
Machine Learning
What problem does this paper attempt to address?
The paper addresses how to adapt quickly to new tasks in few-shot learning. Specifically, the authors propose a new framework, Neural Procedural Bias Meta-Learning (NPBML), which aims to improve few-shot performance by meta-learning task-adaptive procedural biases.

### Problem Background

In few-shot learning, only a limited number of examples are available for each task, so the model must learn quickly from a small amount of data and generalize to new tasks. Gradient-based meta-learning methods such as MAML (Model-Agnostic Meta-Learning) achieve this by learning a shared parameter initialization, but they usually rely on plain gradient descent and a fixed loss function, which limits the performance they can reach in only a few gradient updates.

### Core Problem of the Paper

The core problem is how to meta-learn the parameter initialization, the optimizer, and the loss function simultaneously, so that the model can adapt quickly to new tasks with very limited data and very few gradient steps. The NPBML framework addresses this with three components:

1. **Meta-learned optimizer**: Using Preconditioned Gradient Descent (PGD), meta-learn a parameterized preconditioning matrix \(P_\omega\) that adjusts the direction and magnitude of the gradient.
2. **Meta-learned loss function**: Replace the conventional loss function (such as cross-entropy or squared loss) with a meta-learned loss function \(M_\phi\), which can be adjusted according to task-related information.
3. **Task-adaptive modulation**: Using Feature-wise Linear Modulation (FiLM) layers, allow the initial parameters, the optimizer, and the loss function to be adapted to each task, giving each task its own set of procedural biases.
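
The three components above can be illustrated with a small NumPy sketch of one inner-loop adaptation. This is a toy under stated assumptions, not the authors' implementation: the preconditioner \(P_\omega\) is reduced to a diagonal vector `p_diag`, the learned loss \(M_\phi\) is a hypothetical two-parameter mix of squared and absolute error, and the FiLM parameters `gamma` and `beta` are passed in directly rather than produced from task information. All function and variable names are invented for illustration.

```python
import numpy as np

def film(x, gamma, beta):
    """Feature-wise linear modulation: scale and shift each feature."""
    return gamma * x + beta

def learned_loss_grad(theta, X, y, phi):
    """Gradient w.r.t. theta of a toy parameterized loss (an assumption):
    M_phi = mean(phi[0] * r**2 + phi[1] * |r|), with r = X @ theta - y."""
    r = X @ theta - y
    dr = 2.0 * phi[0] * r + phi[1] * np.sign(r)
    return X.T @ dr / len(y)

def adapt(theta0, X, y, p_diag, phi, gamma, beta, lr=0.1, steps=5):
    """Inner loop: theta <- theta - lr * P_task * grad M_phi(theta).
    P_task is a diagonal preconditioner modulated per task via FiLM."""
    theta = theta0.copy()
    p_task = film(p_diag, gamma, beta)  # task-adapted preconditioner
    for _ in range(steps):
        theta = theta - lr * p_task * learned_loss_grad(theta, X, y, phi)
    return theta

# Toy regression task: adapt from a zero initialization in a few steps.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5])
theta = adapt(np.zeros(3), X, y,
              p_diag=np.ones(3), phi=np.array([1.0, 0.0]),
              gamma=np.ones(3), beta=np.zeros(3))
print("post-adaptation MSE:", np.mean((X @ theta - y) ** 2))
```

With `p_diag` set to ones, `phi = [1, 0]`, and identity FiLM parameters, the loop reduces to ordinary gradient descent on squared error, which is the MAML-style baseline described above. In a full meta-learning setup, `theta0`, `p_diag`, `phi`, and whatever produces `gamma` and `beta` would all be trained in an outer loop across tasks; that outer loop is omitted here.
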
### Experimental Verification

To verify the effectiveness of NPBML, the authors conducted experiments on several established few-shot learning benchmarks, including mini-ImageNet, tiered-ImageNet, CIFAR-FS, and FC-100. The reported results show that NPBML significantly outperforms MAML and its variants on these datasets, especially with high-capacity models (such as ResNet-12).

### Conclusion

NPBML improves rapid adaptation in few-shot learning by meta-learning task-adaptive procedural biases, providing a new and effective approach to few-shot learning.