Abstract: At around 7 months of age, human infants begin to reliably produce well-formed syllables containing both consonants and vowels, a behavior called canonical babbling. Over subsequent months, the frequency of canonical babbling continues to increase. How the infant's nervous system supports the acquisition of this ability is unknown. Here we present a computational model that combines a spiking neural network, reinforcement-modulated spike-timing-dependent plasticity, and a human-like vocal tract to simulate the acquisition of canonical babbling. As in human infants, the model's frequency of canonical babbling gradually increases. The model is rewarded when it produces a sound that is more auditorily salient than sounds it has previously produced. This is consistent with data from human infants indicating that contingent adult responses shape infant behavior and with data from deaf and tracheostomized infants indicating that hearing, including hearing one's own vocalizations, is critical for canonical babbling development. Reward receipt increases the level of dopamine in the neural network. The neural network contains a reservoir with recurrent connections and two motor neuron groups, one agonist and one antagonist, which control the masseter and orbicularis oris muscles, promoting or inhibiting mouth closure. The model learns to increase the number of salient, syllabic sounds it produces by adjusting the base level of muscle activation and increasing the muscles' range of activity. Our results support the possibility that, through dopamine-modulated spike-timing-dependent plasticity, the motor cortex learns to harness its natural oscillations in activity in order to produce syllabic sounds. They thus suggest that learning to produce rhythmic mouth movements for speech production may be supported by general cortical learning mechanisms. The model makes several testable predictions and has implications for our understanding not only of how syllabic vocalizations develop in infancy but also of how they may have evolved.
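
The reward and dopamine steps described in the abstract can be illustrated with a minimal sketch. It is not the model's actual implementation: the abstract does not specify how "more salient than previous sounds" is computed, so the sliding-window mean used here is an assumption, as are all names (`SalienceReward`, `update_dopamine`, `update_weight`) and parameter values (`window`, `boost`, `decay`, `lr`, `w_max`). The eligibility value stands in for whatever trace the spike-timing-dependent plasticity rule would leave on a synapse.

```python
from collections import deque


class SalienceReward:
    """Reward a vocalization when it is more salient than recent ones.

    Assumption: "more salient than previous sounds" is approximated by
    comparing against the mean salience of a sliding window of past sounds.
    """

    def __init__(self, window=10):
        self.history = deque(maxlen=window)  # recent salience scores

    def step(self, salience):
        # No reward on the first sound: there is nothing to compare against.
        rewarded = bool(self.history) and (
            salience > sum(self.history) / len(self.history)
        )
        self.history.append(salience)
        return rewarded


def update_dopamine(dopamine, rewarded, boost=1.0, decay=0.9):
    """Decay the dopamine level, then add a fixed boost if a reward arrived."""
    return dopamine * decay + (boost if rewarded else 0.0)


def update_weight(weight, eligibility, dopamine, lr=0.1, w_max=1.0):
    """Dopamine-gated weight change: the eligibility trace left by spike
    timing alters the synapse only in proportion to the dopamine level.
    The result is clipped to the range [0, w_max]."""
    return min(max(weight + lr * dopamine * eligibility, 0.0), w_max)


# Usage: feed a sequence of (hypothetical) salience scores through the loop.
reward = SalienceReward(window=3)
dopamine = 0.0
for salience in [1.0, 0.5, 2.0, 1.0]:
    rewarded = reward.step(salience)
    dopamine = update_dopamine(dopamine, rewarded)
```

In this sketch only the third sound is rewarded, because it is the only one whose salience exceeds the mean of the sounds before it. With zero dopamine, `update_weight` leaves a synapse unchanged regardless of spike timing, which is the sense in which the plasticity is reward-modulated.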

Learning the sound inventory of a complex vocal skill via an intrinsic reward

Exploring the effectiveness of reward-based learning strategies for second-language speech sounds

A common neural circuit mechanism for internally guided and externally reinforced forms of motor learning

A memory-driven auditory program ensures selective and precise vocal imitation in zebra finches

Neural population dynamics in songbird RA and HVC during learned motor-vocal behavior

Neurally driven synthesis of learned, complex vocalizations

Temporal variability enhances vocal learning

Learning to Produce Syllabic Speech Sounds via Reward-Modulated Neural Plasticity

A model of infant speech perception and learning

Learning to Incentivize Other Learning Agents

Noisy Agents: Self-supervised Exploration by Predicting Auditory Events

A hierarchical neuronal model for generation and online recognition of birdsongs

A subcortical circuit linking the cerebellum to the basal ganglia engaged in vocal learning

Chance, long tails, and inference: a non-Gaussian, Bayesian theory of vocal learning in songbirds

LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning

Finding the Beat: From Socially Coordinated Vocalizations in Songbirds to Rhythmic Entrainment in Humans

Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback

Learning to Communicate Functional States with Nonverbal Expressions for Improved Human-Robot Collaboration

Goal-directed vocal planning in a songbird

Social reinforcement guides operant behaviour and auditory learning in a songbird

Imitation learning of motor primitives and language bootstrapping in robots