Abstract: The circuitry and pathways in the brains of humans and other species have long inspired researchers and system designers to develop accurate and efficient systems capable of solving real-world problems and responding in real time. We propose the Syllable-Specific Temporal Encoding (SSTE) to learn vocal sequences in a reservoir of Izhikevich neurons by forming associations between exclusive input activities and their corresponding syllables in the sequence. Our model converts audio signals to cochleograms using the CAR-FAC model to simulate a brain-like auditory learning and memorization process. The reservoir is trained using a hardware-friendly approach to FORCE learning. Reservoir computing could yield associative memory dynamics with far less computational complexity than RNNs. SSTE-based learning achieves competent accuracy and stable recall of spatiotemporal sequences with fewer reservoir inputs than existing encodings in the literature that serve a similar purpose, offering resource savings. The encoding points to syllable onsets and allows recall from a desired point in the sequence, making it particularly suitable for recalling subsets of long vocal sequences. The SSTE can learn new signals without forgetting previously memorized sequences and is robust against occasional noise, a characteristic of real-world scenarios. The components of this model are configured to reduce resource consumption and computational intensity, addressing some of the cost-efficiency issues that might arise in future implementations aiming for compactness and real-time, low-power operation. Overall, this model proposes a brain-inspired pattern generation network for vocal sequences that can be extended with other bio-inspired computations to explore their potential for brain-like auditory perception.
Future designs could draw inspiration from this model to implement embedded devices that learn vocal sequences and recall them as needed in real time. Such systems could acquire language and speech, operate as artificial assistants, and transcribe text to speech in the presence of natural noise and corruption in audio data.
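
As a rough illustration of the neuron model named in the abstract, the following is a minimal sketch of a single Izhikevich neuron integrated with the forward Euler method. It is not the authors' implementation: the function name `simulate_izhikevich`, the time step, the input current, and the "regular spiking" parameter values (a = 0.02, b = 0.2, c = -65, d = 8, taken from Izhikevich's original 2003 formulation) are assumptions chosen for illustration, and the abstract does not state which values the reservoir uses.

```python
def simulate_izhikevich(current, t_ms=1000.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Euler-integrate one Izhikevich neuron and return spike times in ms.

    All parameter defaults are illustrative assumptions (regular-spiking
    cell), not values reported for the SSTE reservoir.
    """
    v = c          # membrane potential (mV), started at the reset value
    u = b * v      # recovery variable
    spikes = []
    for step in range(int(t_ms / dt)):
        # Membrane potential and recovery dynamics
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
        # Spike: record the time, reset v, and bump the recovery variable
        if v >= 30.0:
            spikes.append(step * dt)
            v = c
            u += d
    return spikes


if __name__ == "__main__":
    spike_times = simulate_izhikevich(current=10.0)
    print(f"{len(spike_times)} spikes in 1 s of simulated time")
```

A reservoir would couple many such neurons through fixed recurrent weights and train only the readout, which is the part FORCE learning addresses; that coupling is omitted here to keep the sketch short.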

SSTE: Syllable-Specific Temporal Encoding to FORCE-learn audio sequences with an associative memory approach
