Abstract: Transformers have revolutionized machine learning models of language and vision, but their connection with neuroscience remains tenuous. Built from attention layers, they require a mass comparison of queries and keys that is difficult to perform using traditional neural circuits. Here, we show that neurons can implement attention-like computations using short-term, Hebbian synaptic potentiation. We call our mechanism the match-and-control principle. It proposes that when activity in an axon is synchronous, or matched, with the somatic activity of a neuron that it synapses onto, the synapse can be briefly and strongly potentiated, allowing the axon to take over, or control, the activity of the downstream neuron for a short time. In our scheme, the keys and queries are represented as spike trains, and comparisons between the two are performed in individual spines, allowing for hundreds of key comparisons per query and roughly as many keys and queries as there are neurons in the network.

Many of the most impressive recent advances in machine learning, from generating images from text to human-like chatbots, are based on a neural network architecture known as the transformer. Transformers are built from so-called attention layers, which perform large numbers of comparisons between the vector outputs of the previous layers, allowing information to flow through the network in a more dynamic way than in previous designs. This large number of comparisons is computationally expensive and has no known analogue in the brain. Here, we show that a variation on a learning mechanism familiar in neuroscience, Hebbian learning, can implement a transformer-like attention computation if the synaptic weight changes are large and rapidly induced. We call our method the match-and-control principle. It proposes that when presynaptic and postsynaptic spike trains match up, small groups of synapses can be transiently potentiated, allowing a few presynaptic axons to control the activity of a neuron. To demonstrate the principle, we build a model of a pyramidal neuron and use it to illustrate the power and limitations of the idea.
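The match-and-control idea can be illustrated with a deliberately simplified toy, which is not the pyramidal-neuron model described above. It assumes that spike trains are binary lists with one entry per time bin, that "matching" is a plain coincidence count, and that only the single best-matching synapse is potentiated (a hard, argmax-style attention rather than a softmax). The function names, the `threshold` parameter, and the example spike trains are all hypothetical.

```python
# Toy illustration only: NOT the biophysical model from the paper.
# Assumption: spike trains are binary lists, one entry per time bin.

def coincidences(query, key):
    """Count the time bins in which both spike trains fire."""
    return sum(q & k for q, k in zip(query, key))


def match_and_control(query, keys, values, threshold):
    """Pick the axon whose spike train best matches the somatic activity.

    query     -- postsynaptic (somatic) spike train, the "query"
    keys      -- presynaptic spike trains, one per synapse, the "keys"
    values    -- what each axon would transmit if it gained control
    threshold -- hypothetical coincidence count required for potentiation

    Returns (value, scores). value is None when no synapse reaches
    the threshold, i.e. nothing is potentiated.
    """
    scores = [coincidences(query, k) for k in keys]
    best = max(range(len(keys)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores
    return values[best], scores


# Hypothetical example data: three axons synapsing onto one neuron.
query = [1, 0, 1, 1, 0, 0, 1, 0]
keys = [
    [1, 0, 1, 1, 0, 0, 1, 0],  # fires in step with the soma
    [0, 1, 0, 0, 1, 1, 0, 1],  # fires only when the soma is silent
    [1, 0, 0, 1, 0, 0, 0, 0],  # partial overlap
]
values = ["A", "B", "C"]

winner, scores = match_and_control(query, keys, values, threshold=3)
print(winner, scores)
```

In this sketch each key is compared with the query independently, loosely mirroring the comparisons that the abstract places in individual spines. Raising `threshold` above the best score leaves every synapse unpotentiated, so no axon takes control.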

Short-term Hebbian learning can implement transformer-like attention