Low-latency auditory spatial attention detection based on spectro-spatial features from EEG

Siqi Cai, Pengcheng Sun, Tanja Schultz, Haizhou Li
DOI: https://doi.org/10.48550/arXiv.2103.03621
2021-07-05
Abstract: Detecting auditory attention based on brain signals enables many everyday applications, and serves as part of the solution to the cocktail party effect in speech processing. Several studies leverage the correlation between brain signals and auditory stimuli to detect the auditory attention of listeners. Recently, studies show that the alpha band (8-13 Hz) EEG signals enable the localization of auditory stimuli. We believe that it is possible to detect auditory spatial attention without the need of auditory stimuli as references. In this work, we use alpha power signals for automatic auditory spatial attention detection. To the best of our knowledge, this is the first attempt to detect spatial attention based on alpha power neural signals. We propose a spectro-spatial feature extraction technique to detect the auditory spatial attention (left/right) based on the topographic specificity of alpha power. Experiments show that the proposed neural approach achieves 81.7% and 94.6% accuracy for 1-second and 10-second decision windows, respectively. Our comparative results show that this neural approach outperforms other competitive models by a large margin in all test cases.
Human-Computer Interaction, Sound, Audio and Speech Processing
What problem does this paper attempt to address?
This paper addresses the problem of detecting auditory spatial attention from electroencephalogram (EEG) signals. Specifically, the authors propose a method that uses alpha-band (8-13 Hz) EEG signals to automatically detect the direction a listener is attending to (left or right) without using the auditory stimuli as a reference. According to the authors, this is the first attempt to detect spatial attention from EEG activity alone, without a clean speech envelope as a reference signal.

The main contributions of the paper are:
1. A method for extracting spectro-spatial features (SSF) from the alpha band of the EEG.
2. The use of a convolutional neural network (CNN) to classify auditory spatial attention.
3. The combination of the two into the SSF-CNN system for low-latency auditory spatial attention detection (ASAD).

The experimental results show that the system reaches 81.7% accuracy with a 1-second decision window and 94.6% accuracy with a 10-second decision window. In addition, the SSF-CNN system outperforms the competing models at every decision window length tested, with the advantage most pronounced for shorter windows. This makes SSF-CNN a strong candidate for practical applications such as neuro-guided hearing aids.
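
To make the idea of alpha-power features per decision window more concrete, here is a minimal sketch in Python. It is not the authors' SSF extraction or their CNN: it only cuts a multi-channel EEG recording into fixed-length decision windows and computes the mean 8-13 Hz power of each channel with an FFT. The function names (`split_windows`, `alpha_power_per_channel`), the Hann taper, and the 128 Hz sampling rate are assumptions chosen for illustration.

```python
import numpy as np


def split_windows(eeg, fs, window_s):
    """Cut a (n_channels, n_samples) recording into non-overlapping windows.

    Hypothetical helper: window_s is the decision window length in seconds.
    Trailing samples that do not fill a whole window are dropped.
    """
    step = int(round(window_s * fs))
    n_windows = eeg.shape[1] // step
    return [eeg[:, i * step:(i + 1) * step] for i in range(n_windows)]


def alpha_power_per_channel(window, fs, band=(8.0, 13.0)):
    """Return the mean spectral power in `band` for each channel of one window.

    Hypothetical helper: a plain FFT estimate with a Hann taper, standing in
    for whatever spectral estimator a real pipeline would use.
    """
    n_samples = window.shape[1]
    taper = np.hanning(n_samples)                  # reduce spectral leakage
    spectrum = np.fft.rfft(window * taper, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    power = np.abs(spectrum) ** 2
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return power[:, in_band].mean(axis=1)


# Synthetic demo (assumed values, not data from the paper):
# channel 0 carries a 10 Hz (alpha) sine, channel 1 a 30 Hz sine.
fs = 128
t = np.arange(3 * fs) / fs
eeg = np.stack([np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 30 * t)])

windows = split_windows(eeg, fs, window_s=1.0)
features = np.stack([alpha_power_per_channel(w, fs) for w in windows])
print(features.shape)  # (number of windows, number of channels)
```

In a full system, the per-channel values of each window would additionally be arranged according to electrode position so that a classifier can exploit the topography of alpha power; that step and the classifier itself are left out of this sketch.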