Cross-Attention-Guided WaveNet for EEG-to-MEL Spectrogram Reconstruction

Hao Li, Yuan Fang, Xueliang Zhang, Fei Chen, Guanglai Gao
DOI: https://doi.org/10.21437/interspeech.2024-1662
2024-01-01
Abstract: This paper introduces an approach that combines a cross-attention-guided WaveNet with a coarse-to-fine granularity strategy to improve the detailed reconstruction of Mel spectrograms from time-domain EEG signals. The proposed model uses WaveNet to reconstruct, in sequence, the envelope, the 10-band Mel spectrogram, the 80-band Mel spectrogram, and the magnitude, at progressively finer levels of granularity. To address the gap between the EEG and speech modalities, a cross-attention mechanism explores correlations across them. A combined loss function and Mixup augmentation are also employed to improve reconstruction performance. Our approach achieves Pearson correlation values of 0.0651 ± 0.0153 on the validation set and 0.0413 ± 0.0169 on the held-out-subjects test set, placing second in the 2024 Auditory EEG Challenge. Ablation experiments validate the contribution of each module. The source code is available online.
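The reported scores are Pearson correlations between reconstructed and reference spectrograms. As a minimal sketch of how such a score could be computed, the snippet below correlates each Mel band over time and averages the result across bands. The function names `pearson` and `mean_band_correlation`, the time-by-band list layout, the averaging over bands, and the choice to score a constant signal as 0 are all assumptions made for illustration; the abstract does not specify the challenge's exact evaluation protocol.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # Assumption: a constant signal is scored as 0 instead of NaN.
    return cov / denom if denom > 0 else 0.0


def mean_band_correlation(
    pred: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
) -> float:
    """Average the per-band Pearson correlation.

    Both inputs are sequences of frames, each frame holding one value per
    Mel band (hypothetical layout: time x bands).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    n_bands = len(target[0])
    scores = [
        pearson([frame[b] for frame in pred], [frame[b] for frame in target])
        for b in range(n_bands)
    ]
    return sum(scores) / n_bands
```

Because the metric is invariant to scale and offset within each band, a reconstruction can score well without matching absolute spectrogram levels, which is one reason a correlation term is often paired with other terms in a combined loss.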