Temporal Modulations Reveal Distinct Rhythmic Properties of Speech and Music

Nai Ding, Aniruddh D. Patel, Lin Chen, Henry Butler, Cheng Luo, David Poeppel
DOI: https://doi.org/10.1101/059683
IF: 9.052
2016-01-01
Neuroscience & Biobehavioral Reviews
Abstract: Speech and music have structured rhythms, but these rhythms are rarely compared empirically. This study, based on large corpora, quantitatively characterizes and compares a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32 Hz) temporal modulations in sound intensity. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics, such as English, French, and Mandarin Chinese). A different but similarly consistent modulation spectrum is observed for Western classical music played by 6 different instruments. Western music, including classical music played by single instruments, symphonic, jazz, and rock music, contains more energy than speech in the low modulation frequency range below 4 Hz. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2 Hz, respectively. These differences in temporal modulations alone, without any spectral details, can discriminate speech and music with high accuracy. Speech and music therefore show distinct and reliable statistical regularities in their temporal modulations that likely facilitate their perceptual analysis and its neural foundations.
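To make the quantity in the abstract concrete, the following is a minimal sketch of how a modulation spectrum of sound intensity could be estimated. It is not the authors' pipeline: the abstract does not give implementation details, so the frame-based RMS envelope, the Hann window, the linear frequency axis, and the names `modulation_spectrum`, `peak_modulation_frequency`, and `am_noise` are all assumptions made for illustration. Only the 0.25-32 Hz band and the approximate 5 Hz and 2 Hz peaks come from the abstract; the test signals are synthetic amplitude-modulated noise, not speech or music.

```python
import numpy as np


def modulation_spectrum(signal, sample_rate, frame_rate=100, f_lo=0.25, f_hi=32.0):
    """Estimate the spectrum of slow intensity modulations in `signal`.

    Assumption: intensity is approximated by the RMS of short,
    non-overlapping frames (about 1 / frame_rate seconds each).
    """
    hop = int(sample_rate // frame_rate)          # samples per frame
    env_rate = sample_rate / hop                  # actual envelope sampling rate
    n_frames = len(signal) // hop
    frames = np.asarray(signal[: n_frames * hop], dtype=float).reshape(n_frames, hop)

    envelope = np.sqrt(np.mean(frames ** 2, axis=1))  # RMS intensity per frame
    envelope = envelope - envelope.mean()             # remove the DC component

    window = np.hanning(n_frames)                     # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(envelope * window))
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / env_rate)

    band = (freqs >= f_lo) & (freqs <= f_hi)          # keep 0.25-32 Hz only
    return freqs[band], spectrum[band]


def peak_modulation_frequency(signal, sample_rate):
    """Return the modulation frequency (Hz) with the largest magnitude."""
    freqs, spectrum = modulation_spectrum(signal, sample_rate)
    return float(freqs[np.argmax(spectrum)])


def am_noise(mod_freq, duration=20.0, sample_rate=8000, seed=0):
    """Hypothetical test signal: white noise amplitude-modulated at `mod_freq` Hz."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration * sample_rate)) / sample_rate
    carrier = rng.standard_normal(t.size)
    return (1.0 + 0.8 * np.sin(2 * np.pi * mod_freq * t)) * carrier, sample_rate


if __name__ == "__main__":
    for rate_hz in (5.0, 2.0):  # rough stand-ins for the speech and music peaks
        sig, sr = am_noise(rate_hz)
        print(rate_hz, peak_modulation_frequency(sig, sr))
```

On these synthetic signals the estimated peak should fall close to the imposed modulation rate. Real recordings would likely need an auditory filterbank and averaging over many excerpts before the broad peaks described in the abstract become visible.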