AudioDiffusion: Generating High-Quality Audios from EEG Signals : Reconstructing Audio from EEG Signals

Ling Kong, Dianyuan Qi, Congsheng Li, Lei Yang
DOI: https://doi.org/10.1109/ISCEIC59030.2023.10271237
2023-08-18
Abstract: The study proposes a new model for generating high-quality audio directly from the brain’s electroencephalogram (EEG) signals. AudioDiffusion uses pre-trained text-to-speech models with temporally masked signal modelling to pre-train the EEG encoder for effective and robust EEG representations. In addition, the method uses a Mel-Frequency Spectrum encoder to provide additional supervision, so that EEG and speech embeddings are better aligned when only a limited number of EEG-audio pairs is available. Overall, the proposed method overcomes the challenges of using EEG signals to generate audio, such as noise, limited information and individual differences, and achieves promising results. Quantitative and qualitative results demonstrate the effectiveness of the proposed method, which is an important step towards portable and low-cost EEG-to-audio generation, with potential applications in neuroscience and natural language processing.
Medicine, Computer Science