Auditory attention decoding from EEG-based Mandarin speech envelope reconstruction

Zihao Xu, Yanru Bai, Ran Zhao, Qi Zheng, Guangjian Ni, Dong Ming
DOI: https://doi.org/10.1016/j.heares.2022.108552
IF: 3.672
2022-09-01
Hearing Research
Abstract: In a cocktail party scenario, the human auditory system extracts information from a specific speaker of interest and ignores the others. Many studies have focused on auditory attention decoding (AAD), but their stimulus materials were mainly non-tonal languages. We used a tonal language (Mandarin) as the speech stimulus and constructed a Long Short-Term Memory (LSTM) architecture for speech envelope reconstruction based on electroencephalogram (EEG) data. The correlation coefficient between the reconstructed envelope and each candidate envelope was calculated to determine the subject's auditory attention. The proposed LSTM architecture outperformed the linear models. The average decoding accuracy in cross-subject and inter-subject cases varied from 63.02% to 74.29%, with the highest accuracy of 89.1% in a decision window of 0.15 s. In addition, the beta-band rhythm was found to play an essential role in distinguishing the attention state from the non-attention state. These results provide a new AAD architecture to help develop neuro-steered hearing devices, especially for tonal languages.
neurosciences, otorhinolaryngology, audiology & speech-language pathology
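
The decision step described in the abstract (correlating the reconstructed envelope with each candidate envelope and choosing the best match) can be illustrated with a minimal sketch. This is not the authors' implementation: the LSTM reconstruction is not shown, the signals are synthetic stand-ins, and the names `pearson` and `decode_attention`, the 640-sample window, and the noise level are assumptions made only for illustration.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D signals."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A constant signal has zero variance; treat its correlation as 0.
    return float((a * b).sum() / denom) if denom > 0 else 0.0


def decode_attention(reconstructed, candidates):
    """Return the index of the candidate envelope that correlates best
    with the reconstructed envelope, plus all correlation scores."""
    scores = [pearson(reconstructed, c) for c in candidates]
    return int(np.argmax(scores)), scores


# Synthetic stand-ins (assumption): two speaker envelopes and a noisy
# "reconstruction" of the first one, as a decoder might produce from EEG.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 640)
env_a = np.abs(np.sin(2 * np.pi * 3 * t))
env_b = np.abs(np.sin(2 * np.pi * 5 * t + 1.0))
reconstructed = env_a + 0.3 * rng.standard_normal(t.size)

attended, scores = decode_attention(reconstructed, [env_a, env_b])
print("attended speaker index:", attended)
print("correlations:", scores)
```

In this sketch the attended speaker is simply the candidate with the larger correlation inside one decision window; shorter windows give fewer samples per correlation, which is why accuracy is usually reported together with the window length.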