Abstract: Computational models that successfully translate neural activity into speech are multiplying in the adult literature, with non-linear convolutional neural network (CNN) approaches joining the more frequently employed linear and mutual information (MI) models. Despite the promise of these methods for uncovering the neural basis of language acquisition by the human brain, similar studies with infants are rare. Existing infant studies rely on simpler cross-correlation and other linear techniques and aim only to establish neural tracking of the broadband speech envelope. Here, three novel computational models were applied to measure whether low-frequency speech envelope information was encoded in infant neural activity. Backward linear and CNN models were applied to estimate speech information from neural activity using linear versus non-linear approaches, and an MI model measured how well the acoustic stimuli were encoded in infant neural responses. Fifty infants provided EEG recordings at 4, 7, and 11 months of age while listening passively to natural speech (sung nursery rhymes) presented via video by a female singer. Each model computed speech information for these nursery rhymes in two frequency bands, delta (1–4 Hz) and theta (4–8 Hz), which are thought to provide different types of linguistic information. All three models demonstrated significant levels of performance for delta-band and theta-band neural activity from 4 months of age. All models also demonstrated higher accuracy for the delta-band neural response in the infant brain. However, only the linear and MI models showed developmental (age-related) effects, and these developmental effects differed by model. Accordingly, the choice of algorithm used to decode speech envelope information from neural activity in the infant brain may determine the developmental conclusions that can be drawn. Better understanding of the strengths and weaknesses of each modelling approach will be fundamental to improving our understanding of how the human brain builds a language system.
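As an illustration of what a backward linear model does, the following is a minimal sketch that reconstructs a speech envelope from time-lagged, multichannel signals using ridge regression. It is not the pipeline used in the study: the data are synthetic, and the function names (`lagged_design`, `fit_ridge`, `run_demo`), the lag range, the regularisation strength, and the train/test split are all assumptions chosen for the example.

```python
import numpy as np


def lagged_design(eeg, lags):
    """Stack time-shifted copies of every channel: (samples, channels * lags)."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * len(lags)))
    for i, lag in enumerate(lags):
        # Row t of `shifted` holds eeg[t + lag], i.e. the response after the stimulus.
        shifted = np.roll(eeg, -lag, axis=0)
        if lag > 0:
            shifted[-lag:] = 0.0  # discard samples that wrapped around
        X[:, i * n_channels:(i + 1) * n_channels] = shifted
    return X


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression weights."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ y)


def run_demo(seed=0):
    """Return the held-out correlation between true and reconstructed envelope."""
    rng = np.random.default_rng(seed)
    n_samples, n_channels, delay = 6000, 8, 10  # hypothetical sizes

    # Synthetic slow "envelope": smoothed white noise, standardised.
    envelope = np.convolve(rng.standard_normal(n_samples),
                           np.ones(25) / 25, mode="same")
    envelope = (envelope - envelope.mean()) / envelope.std()

    # Synthetic "EEG": delayed envelope with channel-specific gain plus noise.
    delayed = np.roll(envelope, delay)
    delayed[:delay] = 0.0
    gains = rng.uniform(0.5, 1.5, n_channels)
    eeg = np.outer(delayed, gains) + rng.standard_normal((n_samples, n_channels))

    X = lagged_design(eeg, lags=list(range(0, 21)))
    split = 4000  # first part for training, remainder for testing
    weights = fit_ridge(X[:split], envelope[:split], alpha=10.0)
    predicted = X[split:] @ weights
    return float(np.corrcoef(predicted, envelope[split:])[0, 1])


if __name__ == "__main__":
    print(f"held-out reconstruction correlation: {run_demo():.3f}")
```

The reconstruction score here is a Pearson correlation on held-out samples, which is one common way to summarise backward-model accuracy. With real recordings the lag window and regularisation would normally be chosen by cross-validation rather than fixed in advance.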

Learning to Produce Syllabic Speech Sounds via Reward-Modulated Neural Plasticity

Emergent Jaw Predominance in Vocal Development through Stochastic Optimization

A model of infant speech perception and learning

A computational model of early language acquisition from audiovisual experiences of young infants

Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning

Artificial Vocal Learning Guided by Speech Recognition: What It May Tell Us about How Children Learn to Speak

Statistical Learning in Speech: A Biologically Based Predictive Learning Model

Exploring the effectiveness of reward-based learning strategies for second-language speech sounds

Speech Coding in the Brain: Representation of Vowel Formants by Midbrain Neurons Tuned to Sound Fluctuations

Electrophysiological responses reveal a dedicated learning mechanism to process salient consonant sounds in human newborns

Decoding speech information from EEG data with 4, 7 and 11 month-old infants: Contrasting convolutional neural network, mutual information-based and backward linear models

Single-neuronal elements of speech production in humans

On the Emergence of Phonological Knowledge and on Motor Planning and Motor Programming in a Developmental Model of Speech Production

Evaluating computational models of infant phonetic learning across languages

Altricial brains and the evolution of infant vocal learning

The Basis for Language Acquisition: Congenitally Deaf Infants Discriminate Vowel Length in the First Months after Cochlear Implantation

Learning Model-Based F0 Production Through Goal-Directed Babbling

A common neural circuit mechanism for internally guided and externally reinforced forms of motor learning

Learning the sound inventory of a complex vocal skill via an intrinsic reward

Anti‐phasic oscillatory development for speech and noise processing in cochlear implanted toddlers

Modeling early phonetic acquisition from child-centered audio data