Acoustic Correlates of the Syllabic Rhythm of Speech: Modulation Spectrum or Local Features of the Temporal Envelope

Yuran Zhang, Jiajie Zou, Nai Ding
DOI: https://doi.org/10.1101/2022.07.17.500382
IF: 9.052
2022-01-01
Neuroscience & Biobehavioral Reviews
Abstract: The speech envelope is considered a major acoustic correlate of the syllable rhythm, since the peak frequency in the speech modulation spectrum matches the mean syllable rate. Nevertheless, it has not been quantified whether the peak modulation frequency can track the syllable rate of individual utterances, or how much variance of the speech envelope can be explained by the syllable rhythm. Here, we address these questions by analyzing large speech corpora (>1000 hours of recordings in multiple languages) using advanced sequence-to-sequence modeling. We find that the peak modulation frequency of speech reliably correlates with the syllable rate of a speaker only when averaged over minutes of speech recordings. In contrast, phase-locking between the speech envelope and syllable onsets is robustly observed within a few seconds of recordings. Based on speaker-independent linear and nonlinear models, the timing of syllable onsets explains about 13% and 46% of the variance of the speech envelope, respectively. These results demonstrate that local temporal features in the speech envelope precisely encode the syllable onsets, but the modulation spectrum is not always dominated by the syllable rhythm.
Competing Interest Statement: The authors have declared no competing interest.
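To make the notion of a "peak modulation frequency" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a crude envelope (rectification followed by frame averaging) and a plain FFT of that envelope; the function names, the 100 Hz envelope rate, and the 1 to 32 Hz search band are illustrative assumptions. The input is synthetic noise modulated at 4 Hz, a stand-in for a syllable rate, so the example says nothing about how reliable the measure is on real short utterances.

```python
import numpy as np

def temporal_envelope(signal, fs, env_fs=100):
    """Crude envelope (assumption): rectify, then average non-overlapping frames."""
    frame = int(fs // env_fs)                 # samples per envelope frame
    n_frames = len(signal) // frame
    rectified = np.abs(signal[: n_frames * frame])
    return rectified.reshape(n_frames, frame).mean(axis=1)

def peak_modulation_frequency(envelope, env_fs, fmin=1.0, fmax=32.0):
    """Frequency (Hz) of the largest envelope-spectrum magnitude within [fmin, fmax]."""
    env = envelope - envelope.mean()          # remove the DC component
    windowed = env * np.hanning(len(env))     # taper to reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(env), d=1.0 / env_fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return float(freqs[band][np.argmax(spectrum[band])])

# Synthetic test signal: white noise amplitude-modulated at a hypothetical 4 Hz rate.
fs = 16000
duration_s = 10
t = np.arange(duration_s * fs) / fs
rng = np.random.default_rng(0)
carrier = rng.standard_normal(t.size)
syllable_rate_hz = 4.0
modulator = 0.5 * (1.0 + np.sin(2 * np.pi * syllable_rate_hz * t))
signal = modulator * carrier

env = temporal_envelope(signal, fs, env_fs=100)
peak = peak_modulation_frequency(env, env_fs=100)
print(f"peak modulation frequency: {peak:.2f} Hz")
```

With a clean, strictly periodic modulator the spectral peak falls at the modulation rate. Natural speech has variable syllable durations, which is one reason a single short utterance may not show such a clear peak.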