Abstract: Deep learning typically requires training a high-capacity architecture on large datasets. However, many important learning problems demand the ability to draw valid inferences from small datasets, and such problems pose a particular challenge for deep learning. For this reason, meta-learning is an active area of research. Recent work has suggested a Memory Augmented Neural Network (MANN) for meta-learning. MANN is an implementation of a Neural Turing Machine (NTM) with the ability to rapidly assimilate new data into its memory and use this data to make accurate predictions. In models such as MANN, the input data samples and their corresponding labels from the previous step are bound together in the same memory locations. This often leads to memory interference when performing a task, because these models have to retrieve a feature of an input from a certain memory location and read only the label information bound to that location. In this paper, we address this issue by presenting a more robust MANN. We revisit the idea of meta-learning and propose a new memory augmented neural network that explicitly splits the external memory into a feature memory and a label memory. The feature memory stores the features of input data samples, and the label memory stores their labels. Hence, when predicting the label of a given input, our model uses its feature memory unit as a reference to extract the stored feature of the input and, based on that feature, retrieves the label information of the input from the label memory unit. For the network to function in this framework, we design a new memory-writing module that encodes label information into the label memory in accordance with the meta-learning task structure. We demonstrate that our model outperforms MANN by a large margin in supervised one-shot classification tasks on the Omniglot and MNIST datasets.
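
The two-step read described in the abstract (address by feature, then read from the label memory) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's implementation: the function names, the cosine-similarity addressing followed by a softmax, the one-hot label rows, and the toy memory contents are all hypothetical, and the memory-writing module is not modeled at all.

```python
import math

def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read_label(feature_memory, label_memory, query):
    """Address by feature, then read from the label memory.

    Assumption: read weights are a softmax over cosine similarities
    between the query feature and each feature-memory slot. The same
    weights are then applied to the label-memory rows, so slot i of
    the label memory is taken to hold the label of slot i's feature.
    """
    weights = softmax([cosine_similarity(slot, query) for slot in feature_memory])
    n_classes = len(label_memory[0])
    return [
        sum(w * row[c] for w, row in zip(weights, label_memory))
        for c in range(n_classes)
    ]

# Hypothetical toy memories: three slots, 2-d features, one-hot labels.
feature_memory = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
label_memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

query = [0.9, 0.1]  # closest in direction to slot 0
label_scores = read_label(feature_memory, label_memory, query)
predicted = max(range(len(label_scores)), key=label_scores.__getitem__)
```

Because the features and labels live in separate arrays, the similarity computation never touches label content. This is one way to picture how the split could reduce the interference described for models that bind both in the same location.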

Meta-Learning via Feature-Label Memory Network

Online Adaptation of Language Models with a Memory of Amortized Contexts

When MAML Can Adapt Fast and How to Assist When It Cannot

AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning

Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling

Augmenting Language Models with Long-Term Memory

Survey on Memory-Augmented Neural Networks: Cognitive Insights to AI Applications

Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

Mitigating Memorization In Language Models

Memory-Augmented Capsule Network for Adaptable Lung Nodule Classification

Memory Aware Synapses: Learning what (not) to forget

One-shot Learning with Memory-Augmented Neural Networks

An Energy-Efficient Architecture for Accelerating Inference of Memory-Augmented Neural Networks

AdaLomo: Low-memory Optimization with Adaptive Learning Rate

Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models

CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models

Efficient Adaptive Optimization via Subset-Norm and Subspace-Momentum: Fast, Memory-Reduced Training with Convergence Guarantees

Online Learning via Memory: Retrieval-Augmented Detector Adaptation

Fast & Slow Learning: Incorporating Synthetic Gradients in Neural Memory Controllers

Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and Personalized Federated Learning