Abstract: Research on artificial neural networks has struggled to devise a way to incorporate working memory into neural networks. While the ``long-term'' memory can be seen as the learned weights, working memory likely consists more of dynamical activity, which is missing from feed-forward models. Current state-of-the-art models such as transformers tend to ``solve'' this by ignoring working memory entirely and simply processing the sequence as a single piece of data; however, this means the network cannot process the sequence in an online fashion, and it leads to an immense explosion in memory requirements. Here, inspired by a combination of control theory, reservoir computing, deep learning, and recurrent neural networks, we offer an alternative paradigm that combines the strength of recurrent networks with the pattern-matching capability of feed-forward neural networks, which we call the \textit{Maelstrom Networks} paradigm. This paradigm leaves the recurrent component, the \textit{Maelstrom}, unlearned, and offloads the learning to a powerful feed-forward network. This allows the network to leverage the strength of feed-forward training without unrolling the network, and allows the memory to be implemented in new neuromorphic hardware. It endows a neural network with a sequential memory that takes advantage of the inductive bias that data is organized causally in the temporal domain, and imbues the network with a state that represents the agent's ``self'' moving through the environment. This could also lead the way to continual learning, with the network modularized and ``protected'' from the overwrites that come with new data. In addition to helping solve the performance problems that plague current non-temporal deep networks, this could also finally lead towards endowing artificial networks with a sense of ``self''.
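
The split described in the abstract, an unlearned recurrent state feeding a learned feed-forward readout, can be illustrated with a minimal reservoir-computing sketch. This is not the paper's implementation: the function names (`make_reservoir`, `step`, `run`, `fit_readout`), the tanh update, the spectral-radius scaling, and the one-step-delay recall task are all assumptions chosen for illustration, and the ``powerful feed-forward network'' is replaced here by a simple linear readout fitted with ridge regression.

```python
import numpy as np

def make_reservoir(n_in, n_state, spectral_radius=0.9, seed=0):
    """Fixed random weights for the recurrent part (never trained)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_state, n_in))
    W = rng.normal(size=(n_state, n_state))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def step(state, x, W_in, W):
    """One online update of the recurrent state from a single input."""
    return np.tanh(W @ state + W_in @ x)

def run(inputs, W_in, W):
    """Process a sequence one element at a time; return all states."""
    state = np.zeros(W.shape[0])
    states = []
    for x in inputs:
        state = step(state, x, W_in, W)
        states.append(state)
    return np.array(states)

def fit_readout(states, targets, ridge=1e-6):
    """Learned part: linear readout via ridge regression (no unrolling)."""
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ targets)

# Hypothetical toy task: recall the input seen one step earlier.
rng = np.random.default_rng(1)
inputs = rng.uniform(-1.0, 1.0, size=(500, 1))
targets = np.vstack([np.zeros((1, 1)), inputs[:-1]])

W_in, W = make_reservoir(n_in=1, n_state=50)
states = run(inputs, W_in, W)
W_out = fit_readout(states, targets)
predictions = states @ W_out
train_mse = float(np.mean((predictions - targets) ** 2))
```

Only `W_out` is fitted; `W_in` and `W` stay at their random initial values, so training touches the feed-forward part alone and the sequence is consumed one element at a time by `step`.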
