Heterogeneous Memory Augmented Neural Networks

Zihan Qiu, Zhen Liu, Shuicheng Yan, Shanghang Zhang, Jie Fu
2023-10-17
Abstract: It has been shown that semi-parametric methods, which combine standard neural networks with non-parametric components such as external memory modules and data retrieval, are particularly helpful in data-scarce and out-of-distribution (OOD) scenarios. However, existing semi-parametric methods mostly depend on independent raw data points; this strategy is difficult to scale up because of both its high computational cost and the inability of current attention mechanisms to handle a large number of tokens. In this paper, we introduce a novel heterogeneous memory augmentation approach for neural networks which, by introducing learnable memory tokens with an attention mechanism, can effectively boost performance without huge computational overhead. Our general-purpose method can be seamlessly combined with various backbones (MLP, CNN, GNN, and Transformer) in a plug-and-play manner. We extensively evaluate our approach on various image and graph-based tasks under both in-distribution (ID) and OOD conditions and show its competitive performance against task-specific state-of-the-art methods. Code is available at https://github.com/qiuzh20/HMA.
Machine Learning