Decoding speech information from EEG data with 4, 7 and 11 month-old infants: Contrasting convolutional neural network, mutual information-based and backward linear models

Mahmoud Keshavarzi, Áine Ní Choisdealbha, Adam Attaheri, Sinead Rocha, Perrine Brusini, Samuel Gibbon, Panagiotis Boutris, Natasha Mead, Helen Olawole-Scott, Henna Ahmed, Sheila A. Flanagan, Kanad Mandke, Usha Goswami
DOI: https://doi.org/10.31234/osf.io/a6qfw
2021-11-26
Abstract: Computational models that successfully translate neural activity into speech are multiplying in the adult literature, with non-linear convolutional neural network (CNN) approaches joining the more frequently employed linear and mutual information (MI) models. Despite the promise of these methods for uncovering the neural basis of language acquisition by the human brain, similar studies with infants are rare. Existing infant studies rely on simpler cross-correlation and other linear techniques and aim only to establish neural tracking of the broadband speech envelope. Here, three novel computational models were applied to measure whether low-frequency speech envelope information was encoded in infant neural activity. Backward linear and CNN models were applied to estimate speech information from neural activity using linear versus nonlinear approaches, and an MI model measured how well the acoustic stimuli were encoded in infant neural responses. Fifty infants provided EEG recordings when aged 4, 7, and 11 months, while listening passively to natural speech (sung nursery rhymes) presented by video with a female singer. Each model computed speech information for these nursery rhymes in two different frequency bands, delta (1–4 Hz) and theta (4–8 Hz), thought to provide different types of linguistic information. All three models demonstrated significant levels of performance for delta-band and theta-band neural activity from 4 months of age. All models also demonstrated higher accuracy for the delta-band neural response in the infant brain. However, only the linear and MI models showed developmental (age-related) effects, and these developmental effects differed by model. Accordingly, the choice of algorithm used to decode speech envelope information from neural activity in the infant brain may determine the developmental conclusions that can be drawn.
Better understanding of the strengths and weaknesses of each modelling approach will be fundamental to improving our understanding of how the human brain builds a language system.
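For readers unfamiliar with the term, a backward linear model reconstructs the stimulus envelope from a window of time-lagged EEG samples. The sketch below illustrates the general idea on synthetic data only. It is not the authors' pipeline: the function names, the ridge penalty, the 10-sample lag window and the simulated signals are all assumptions made for illustration, and it requires NumPy.

```python
import numpy as np

def lagged_design(eeg, n_lags):
    """Stack EEG samples at t, t+1, ..., t+n_lags-1 into one row per time point.

    The neural response follows the stimulus, so the envelope at time t is
    decoded from EEG at later samples. Rows near the end are zero-padded.
    """
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * n_lags))
    for lag in range(n_lags):
        cols = slice(lag * n_channels, (lag + 1) * n_channels)
        X[: n_samples - lag, cols] = eeg[lag:]
    return X

def fit_backward_model(eeg, envelope, n_lags, ridge):
    """Ridge-regularised least squares: w = (X'X + ridge * I)^-1 X'y."""
    X = lagged_design(eeg, n_lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ envelope)

def decode(eeg, weights, n_lags):
    """Reconstruct the envelope from EEG with previously fitted weights."""
    return lagged_design(eeg, n_lags) @ weights

# Synthetic stand-in data (assumption): a smooth "envelope" that appears in
# 8 simulated channels after a 5-sample delay, plus Gaussian noise.
rng = np.random.default_rng(0)
n_samples, n_channels, delay, n_lags = 2000, 8, 5, 10
envelope = np.convolve(rng.standard_normal(n_samples), np.ones(20) / 20, mode="same")
mixing = rng.standard_normal(n_channels)
eeg = np.outer(np.roll(envelope, delay), mixing)
eeg += 0.2 * rng.standard_normal((n_samples, n_channels))

# Fit on the first half, evaluate on the held-out second half.
half = n_samples // 2
weights = fit_backward_model(eeg[:half], envelope[:half], n_lags, ridge=1.0)
reconstruction = decode(eeg[half:], weights, n_lags)

# Decoding accuracy as the Pearson correlation between the reconstructed
# and the true envelope.
accuracy = np.corrcoef(reconstruction, envelope[half:])[0, 1]
print(f"held-out reconstruction correlation: {accuracy:.2f}")
```

In a sketch like this, decoding accuracy is the correlation between the reconstructed and true envelope on held-out data. Comparing that score across frequency bands or age groups is the kind of contrast the abstract describes.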