Using Ear-EEG to Decode Auditory Attention in Multiple-speaker Environment

Haolin Zhu, Yujie Yan, Xiran Xu, Zhongshu Ge, Pei Tian, Xihong Wu, Jing Chen
2024-09-13
Abstract: Auditory Attention Decoding (AAD) can help determine the identity of the attended speaker during an auditory selective attention task by analyzing electroencephalography (EEG) data. Most AAD studies are based on scalp-EEG signals in two-speaker scenarios, which are far from real-world applications. Ear-EEG has recently gained significant attention due to its motion tolerance and invisibility during data acquisition, which make it easy to integrate with other devices. In this work, participants selectively attended to one of four spatially separated speakers in an anechoic room. EEG data were collected concurrently from a scalp-EEG system and an ear-EEG system (cEEGrids). Temporal response functions (TRFs) and stimulus reconstruction (SR) were applied to the ear-EEG data. Results showed that the TRFs for the attended speech were stronger than those for each unattended speech stream, and decoding accuracy was 41.3% with a 60-second decision window (chance level of 25%). To further investigate the impact of electrode placement and quantity, SR was applied to both scalp-EEG and ear-EEG, revealing that while the number of electrodes had a minor effect, their positioning had a significant influence on decoding accuracy. An auditory spatial attention detection (ASAD) method, STAnet, was tested on this ear-EEG database, reaching 93.1% accuracy with a 1-second decision window. The implementation code and database for our work are available on GitHub: https://github.com/zhl486/Ear_EEG_code.git and Zenodo: https://zenodo.org/records/10803261.
Signal Processing, Sound, Audio and Speech Processing
What problem does this paper attempt to address?
This paper addresses the problem of decoding auditory attention from Ear-EEG in a multi-speaker environment, here a four-speaker setting. The study tests whether Ear-EEG can identify the speaker a listener is attending to while several interfering sound sources are present. This helps to explain the brain's selective attention mechanisms in complex auditory environments and supports the development of cognitively controlled or neuro-guided hearing devices.

### Main Research Content:

1. **Experimental Design**: Participants were asked to attend selectively to one of four spatially separated speakers in an anechoic chamber while ignoring the other three. Both scalp-EEG and ear-EEG data were collected during the experiment.
2. **Methods**:
   - **Temporal Response Function (TRF) Analysis**: TRF analysis was performed on the Ear-EEG data to assess the brain's tracking of target and non-target speech.
   - **Stimulus Reconstruction (SR)**: The SR method was used to reconstruct the speech envelope from EEG signals, further validating the decoding capability of Ear-EEG with multiple interfering speakers.
   - **Auditory Spatial Attention Detection (ASAD)**: The STAnet model was used for ASAD to evaluate decoding performance within short time windows.
3. **Results**:
   - TRF analysis showed that the response to target speech was stronger than the response to non-target speech, indicating that Ear-EEG can distinguish between the two.
   - The SR method achieved a decoding accuracy of 41.3% with a 60-second decision window, above the chance level of 25%.
   - The ASAD method achieved an average decoding accuracy of 93.1% with a 1-second decision window, far exceeding the SR-based method.
4. **Discussion**:
   - Although the decoding performance of Ear-EEG with multiple interfering speakers is slightly lower than that of scalp-EEG, it remains above chance level.
   - Electrode position has a greater impact on decoding performance than the number of electrodes, suggesting that careful selection of electrode positions can improve decoding results.
   - The study supports the potential of Ear-EEG in practical applications, especially in scenarios requiring discreet, unobtrusive EEG collection.

### Conclusion:

This study demonstrates the feasibility of using Ear-EEG to decode auditory attention in a four-speaker environment. Using TRF and SR methods, it verifies that Ear-EEG can distinguish between target and non-target speech, while the ASAD method further improves decoding performance within short time windows. These results provide a reference for developing more practical and accurate Ear-EEG applications.
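The stimulus reconstruction step summarized above can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the linked repository). It assumes a linear backward model fitted by ridge regression on time-lagged EEG, then picks the speaker whose envelope correlates best with the reconstruction. All function names, the lag range, the ridge value, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def lag_matrix(eeg, lags):
    """Stack time-shifted copies of the EEG: (samples, channels * n_lags)."""
    n_samples, n_channels = eeg.shape
    X = np.zeros((n_samples, n_channels * len(lags)))
    for i, lag in enumerate(lags):
        # The response follows the stimulus, so EEG at t + lag
        # is used to reconstruct the envelope at t.
        shifted = np.roll(eeg, -lag, axis=0)
        if lag > 0:
            shifted[-lag:] = 0.0  # discard wrapped-around samples
        X[:, i * n_channels:(i + 1) * n_channels] = shifted
    return X

def train_decoder(eeg, attended_env, lags, ridge=1e2):
    """Fit backward-model weights with ridge regression (ridge value assumed)."""
    X = lag_matrix(eeg, lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ attended_env)

def decode_attention(eeg, candidate_envs, decoder, lags):
    """Reconstruct the envelope and return the best-correlated speaker index."""
    recon = lag_matrix(eeg, lags) @ decoder
    scores = [np.corrcoef(recon, env)[0, 1] for env in candidate_envs]
    return int(np.argmax(scores)), scores

def simulate_trial(rng, n_samples, mixing, attended, delay=8, gain=0.5):
    """Hypothetical synthetic trial: EEG tracks only the attended envelope."""
    envs = np.abs(rng.standard_normal((4, n_samples)))
    envs -= envs.mean(axis=1, keepdims=True)
    delayed = np.roll(envs[attended], delay)  # neural response lags the stimulus
    noise = rng.standard_normal((n_samples, len(mixing)))
    return gain * np.outer(delayed, mixing) + noise, envs

rng = np.random.default_rng(0)
lags = list(range(16))               # assumed lag range, in samples
mixing = np.linspace(0.5, 1.5, 8)    # assumed channel gains, 8 channels
n_samples = 60 * 64                  # a 60 s window at an assumed 64 Hz rate

train_eeg, train_envs = simulate_trial(rng, n_samples, mixing, attended=2)
decoder = train_decoder(train_eeg, train_envs[2], lags)

test_eeg, test_envs = simulate_trial(rng, n_samples, mixing, attended=1)
predicted, scores = decode_attention(test_eeg, test_envs, decoder, lags)
print("predicted speaker:", predicted)
print("correlations:", np.round(scores, 3))
```

With four candidate envelopes, a decoder that guesses at random is correct 25% of the time, which is the chance level quoted for the 60-second window. The synthetic signal here is far cleaner than real ear-EEG, so the sketch shows the decision rule rather than realistic accuracy.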