Memorization-Based Training and Testing Paradigm for Robust Vocal Identity Recognition in Expressive Speech Using Event-Related Potentials Analysis

Wenjun Chen, Xiaoming Jiang
DOI: https://doi.org/10.3791/66913
2024-08-09
Abstract: Recognizing familiar speakers from vocal streams is a fundamental aspect of human verbal communication. However, it remains unclear how listeners can still discern a speaker's identity in expressive speech. This study develops a memorization-based approach to individual speaker identity recognition and an accompanying electroencephalogram (EEG) data analysis pipeline, which monitors how listeners recognize familiar speakers and tell unfamiliar ones apart. EEG data capture online cognitive processes during the voice-based distinction between new and old speakers, offering a real-time measure of brain activity that overcomes the limits of reaction time and accuracy measurements. The paradigm comprises three steps: listeners establish associations between three voices and their names (training); listeners indicate the name corresponding to a voice from three candidates (checking); and listeners distinguish between three old and three new speaker voices in a two-alternative forced-choice task (testing). The speech prosody in testing was either confident or doubtful. EEG data were collected using a 64-channel EEG system, preprocessed, and then imported into RStudio for ERP and statistical analysis and into MATLAB for brain topography. Results showed that an enlarged late positive component (LPC) was elicited in the old-talker compared to the new-talker condition in the 400-850 ms window at Pz and across a wider range of electrodes in both prosodies. However, the old/new effect was robust at central and posterior electrodes for doubtful prosody, whereas it was robust at anterior, central, and posterior electrodes for confident prosody. This study proposes that this experimental design can serve as a reference for investigating speaker-specific cue-binding effects in various scenarios (e.g., anaphoric expression) and in patient pathologies such as phonagnosia.
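The old/new LPC comparison described in the abstract can be illustrated with a minimal sketch: average the signal at one electrode over the 400-850 ms window for each trial, then compare the condition means. This is not the authors' RStudio pipeline. The sampling rate, epoch limits, trial counts, and amplitudes below are assumptions chosen for illustration, and the data are synthetic noise with an injected positive deflection rather than real recordings.

```python
import random
from statistics import mean

SFREQ = 500               # Hz; assumed sampling rate, not stated in the abstract
TMIN_MS, TMAX_MS = -200, 1000   # assumed epoch limits relative to voice onset
WINDOW_MS = (400, 850)    # LPC window reported in the abstract


def times_ms():
    """Time stamps (ms) of every sample in one epoch."""
    step = 1000 / SFREQ
    n_samples = int((TMAX_MS - TMIN_MS) / step) + 1
    return [TMIN_MS + i * step for i in range(n_samples)]


def simulate_epoch(rng, lpc_gain):
    """Synthetic single-electrode epoch (microvolts): unit-variance noise
    plus a constant positive deflection inside the LPC window."""
    lo, hi = WINDOW_MS
    return [
        rng.gauss(0.0, 1.0) + (lpc_gain if lo <= t <= hi else 0.0)
        for t in times_ms()
    ]


def window_mean(epoch, window=WINDOW_MS):
    """Mean amplitude of one epoch inside the given time window."""
    lo, hi = window
    return mean(v for v, t in zip(epoch, times_ms()) if lo <= t <= hi)


def condition_mean(epochs):
    """Average the per-trial window means across all trials of a condition."""
    return mean(window_mean(e) for e in epochs)


rng = random.Random(0)
# Hypothetical effect sizes: old talkers get a larger deflection than new ones.
old_epochs = [simulate_epoch(rng, lpc_gain=2.0) for _ in range(40)]
new_epochs = [simulate_epoch(rng, lpc_gain=0.5) for _ in range(40)]

old_new_effect = condition_mean(old_epochs) - condition_mean(new_epochs)
print(f"old minus new mean amplitude (400-850 ms): {old_new_effect:.2f} uV")
```

In a real analysis the per-trial window means would be computed separately for each electrode and prosody condition and then passed to a statistical model, which is where the anterior, central, and posterior differences reported above would emerge.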