Abstract: Speech development is crucial for a child’s mental growth and significantly affects later educational and professional achievement. It enables the child to interact with the external environment and to develop self-awareness and behavioral skills. The study of the mechanisms of speech development disorders, together with the development of diagnostic and remediation strategies, is therefore essential. Numerous cognitive and neurophysiological investigations of speech and its disorders in children are currently under way. Electroencephalography (EEG) studies have demonstrated consistent evoked responses to auditory and visual speech-related stimuli, including individual phonemes and syllables, and altered responses have been detected in children with diagnosed speech disorders. The debate about the neurophysiological predictors and correlates of specific speech development disorders continues. The evoked potential method requires isolated “ideal” stimuli and many repetitions of a single stimulus, which constrains experimental design. Brain responses to prolonged, “natural” stimuli may differ from those obtained with isolated stimuli, which potentially reduces the ecological validity of such studies. In recent years, the temporal response function has become increasingly popular in speech research. This method estimates neurophysiological responses to continuous, natural, and ecologically valid stimuli [1–3]. Applied to speech, it allows the study of the brain’s response to changes in the acoustic, linguistic, and semantic characteristics of natural narrative speech [1].
The mathematical basis of the temporal response function (TRF) is the regularized least-squares solution $w = (S^{T}S + \lambda E)^{-1} S^{T}R$, where S is the matrix of stimulus characteristics, R is the matrix of the neurophysiological signal recorded in response to the stimulus, λ is the regularization parameter, E is the identity matrix, and w is the TRF itself, a matrix of linear transformation coefficients from stimulus space to response space [1]. The TRF serves as a “bridge” between the stimulus and the neurophysiological response, as it reflects the neural operations that occur between the two. The S and R matrices incorporate time lags, which makes it possible to estimate the brain’s response to the presented stimulus within a specific time window. The TRF has been used extensively in speech studies [2, 3], but few studies have applied it in research with children [4, 5]. Ecologically valid speech stimuli make experimental paradigms easier for children to complete and allow brain responses to speech to be evaluated as it occurs in real-life situations, not only under experimentally created conditions. The TRF can be applied to both linguistic and acoustic features of speech, which makes it particularly attractive for studying the psychophysiological mechanisms of speech development in children with different developmental trajectories. We applied this approach in our study of speech development in children aged 3 to 8 years. Fifty-six children (33 boys and 23 girls; mean age 5.64 years, SD=1.33 years) participated. Participants listened to three audio stories: a children’s story about hedgehogs and adapted versions of the tales “Brick and Wax” and “The Golden Duck”, all recorded in a female voice. All audio stimuli were accompanied by video to maintain the children’s attention. The total duration of the stimuli was 15 minutes.
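The ridge-regression solution for w given above can be illustrated with a minimal NumPy sketch. It is not the mTRF Toolbox implementation used in the study: the helper names `lag_matrix` and `fit_trf`, the simulated stimulus, the four-sample kernel, and the value of λ are all assumptions chosen for illustration, and only the stimulus matrix is lagged here.

```python
import numpy as np

def lag_matrix(stimulus, lags):
    """Build S: one column per lag (in samples), each a time-shifted copy of the stimulus."""
    n = len(stimulus)
    S = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            # Response at time t depends on the stimulus at time t - lag.
            S[lag:, j] = stimulus[:n - lag]
        else:
            # Negative lag: stimulus samples that come after time t.
            S[:n + lag, j] = stimulus[-lag:]
    return S

def fit_trf(stimulus, response, lags, lam):
    """Solve w = (S^T S + lam * E)^(-1) S^T R, with E the identity matrix."""
    S = lag_matrix(stimulus, lags)
    E = np.eye(S.shape[1])
    # Solving the linear system avoids forming the matrix inverse explicitly.
    return np.linalg.solve(S.T @ S + lam * E, S.T @ response)

# Hypothetical demo: simulate a one-channel "EEG" as a lagged, weighted copy
# of a random stimulus plus noise, then recover the weights.
rng = np.random.default_rng(0)
stimulus = rng.standard_normal(2000)
true_kernel = np.array([0.0, 0.5, 1.0, 0.3])  # assumed kernel, not study data
lags = [0, 1, 2, 3]
response = lag_matrix(stimulus, lags) @ true_kernel + 0.1 * rng.standard_normal(2000)

w = fit_trf(stimulus, response[:, None], lags, lam=1e-3)
print(np.round(w[:, 0], 2))
```

In this sketch the recovered weights should lie close to the simulated kernel because the noise is small and the stimulus is white. With real EEG, λ is normally selected by cross-validation rather than fixed in advance.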
The audio stories were presented using Presentation® software (Neurobehavioral Systems, Inc., Berkeley, CA). Story comprehension was assessed by asking the children 8 “yes/no” questions after each story. In addition, on a different day, the Preschool Language Scales, Fifth Edition (PLS-5) were used to assess each child’s current level of receptive and expressive speech development. A 32-channel EEG was recorded with a Brain Products actiCHamp system (Brain Products GmbH, Gilching, Germany) with the reference electrode at FCz. EEG pre-processing was performed with the MNE library for Python and comprised band-pass filtering between 1 and 15 Hz, visual inspection of the recordings for noisy channels, interpolation of bad channels where needed, removal of oculomotor artifacts by independent component analysis, and re-referencing to the average reference. The EEG and the stimulus were synchronized using a marker placed at stimulus onset and were subsequently aligned within specific epochs. Further processing was carried out in MATLAB (version 2021b) using the mTRF Toolbox [1]. The Toolbox’s functions were used to extract the envelope of the speech stimulus, which served as input to the TRF. The stimulus and the EEG were downsampled to 128 Hz, and the analysis used a time window from –200 to 800 ms. The TRF prediction coefficient, i.e., the correlation coefficient between the actual data and the data predicted by the model after training and cross-validation, was selected for analysis. The mean prediction coefficient across the entire sample was 0.041 (range: –0.002 to 0.106). These coefficients differed significantly from zero (t(55)=13.1, p<0.001). In addition, the prediction coefficients averaged within each participant across all EEG channels correlated positively and significantly with participants’ age (r=0.379, p=0.004).
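The two statistics described above, the prediction coefficient (a Pearson correlation between actual and predicted signals) and the one-sample t-test of those coefficients against zero, can be sketched with the Python standard library alone. The function names `pearson_r` and `one_sample_t` and all numbers below are hypothetical toy values, not data from the study.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def one_sample_t(values, popmean=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == popmean."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return (mean - popmean) / (sd / math.sqrt(n)), n - 1

# Toy "actual" and "predicted" EEG samples for one participant (hypothetical).
actual = [0.1, 0.4, 0.3, 0.8, 0.6]
predicted = [0.2, 0.3, 0.4, 0.6, 0.7]
prediction_coefficient = pearson_r(actual, predicted)

# Toy per-participant prediction coefficients (hypothetical, five participants).
coefficients = [0.02, 0.05, 0.04, 0.03, 0.06]
t_value, df = one_sample_t(coefficients)
print(prediction_coefficient, t_value, df)
```

With 56 participants the same test has 55 degrees of freedom, which matches the t(55) reported above.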
The linear model underlying the TRF predicted the EEG signal better as the child’s age increased. The prediction coefficient also correlated positively and significantly with scores on the receptive speech scale of the PLS-5 (r=0.33, p=0.026). PLS-5 scores, in turn, were strongly correlated with age (r=0.596, p<0.001). The model prediction coefficient correlated positively with scores on the listening comprehension questionnaire (r=0.39, p=0.012). The questionnaire scores were also significantly associated with scores on the PLS-5 receptive speech scale (r=0.82, p<0.001) and with participants’ age (r=0.51, p=0.001). Substantively, the prediction coefficient of the temporal response function reflects cortical tracking of the currently attended stimulus and is significantly associated with listening comprehension [2, 3]. Our results show significant positive correlations among children’s age, their speech comprehension as measured by the PLS-5, and their scores on the listening comprehension questionnaire administered immediately after the experimental task; the prediction coefficient correlated positively with each of these measures. Thus, the temporal response function makes it possible to evaluate the capacity of the cerebral cortex to track the acoustic speech signal in children and yields neurophysiological markers of speech reception and comprehension. The experimental framework can be applied to identify neurophysiological correlates of receptive speech across age groups and in participants with different levels of language and speech skills. The experimental paradigm presented here is part of the research carried out by the Neurobiology of Oral and Written Speech in Developmental Disorders division at the Center for Cognitive Sciences, Sirius University.
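As a rough consistency check on the reported correlations, a Pearson coefficient can be converted to a t statistic under the null hypothesis of zero correlation. The worked example below uses the correlation between the prediction coefficient and age (r=0.379) and assumes that all 56 participants entered that analysis. The abstract does not state the sample size for each test, so n=56 is an assumption.

```latex
\[
t = r\sqrt{\frac{n-2}{1-r^{2}}},
\qquad
t = 0.379\sqrt{\frac{56-2}{1-0.379^{2}}} \approx 3.01
\quad (\mathrm{df} = 54)
\]
```

Under this assumption, a t value of about 3.01 with 54 degrees of freedom is broadly in line with the reported two-sided p=0.004.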
The authors extend their gratitude to the study participants and project team.
