Abstract: The process of reconstructing experiences from human brain activity offers a unique lens into how the brain interprets and represents the world. In this paper, we introduce a method for reconstructing music from brain activity, captured using functional magnetic resonance imaging (fMRI). Our approach uses either music retrieval or the MusicLM music generation model conditioned on embeddings derived from fMRI data. The generated music resembles the musical stimuli that human subjects experienced, with respect to semantic properties like genre, instrumentation, and mood. We investigate the relationship between different components of MusicLM and brain activity through a voxel-wise encoding modeling analysis. Furthermore, we discuss which brain regions represent information derived from purely textual descriptions of music stimuli. We provide supplementary material including examples of the reconstructed music at https://google-research.github.io/seanet/brain2music.

What problem does this paper attempt to address?

The core problem this paper addresses is how to reconstruct music from human brain activity captured by functional magnetic resonance imaging (fMRI). Specifically, the researchers use either music retrieval or the MusicLM music generation model, conditioned on embeddings derived from fMRI data, to produce music that resembles the original stimuli in semantic attributes such as genre, instrumentation, and mood. They also study the relationship between different components of MusicLM and brain activity through a voxel-wise encoding modeling analysis, and discuss which brain regions represent information derived purely from text descriptions of the music stimuli.

### Main contributions:

1. **Music Reconstruction**: Music is reconstructed from fMRI scans by predicting high-dimensional, semantically structured music embeddings and using a deep neural network to generate music from them. Evaluations show that the reconstructed music is semantically similar to the original stimuli.
2. **Prediction of Activity in the Brain's Auditory Cortex**: Different components of the music generation model can predict activity in the human auditory cortex. The distinction between low-level and high-level representations is less pronounced in the auditory cortex than it is for visual stimuli in the visual cortex.
3. **Overlapping Prediction in the Auditory Cortex**: The voxels in the auditory cortex that can be predicted from purely textual descriptions of music overlap significantly with those predicted from the music itself.

### Method Overview:

- **Dataset**: The music genre neuroimaging dataset of Nakai et al. (2022), which contains 540 music segments from 10 genres.
- **Model**: The MuLan joint text/music embedding model and the MusicLM conditional music generation model. MuLan maps music and text into a shared 128-dimensional embedding space, and MusicLM generates music conditioned on these embeddings.
- **Decoding Process**: fMRI responses are mapped to MuLan embeddings by linear regression, and MusicLM then generates music from the predicted embeddings. Retrieving similar music from an existing music library is explored as an alternative.
- **Evaluation Metrics**: Identification accuracy and the top-n agreement rate of AudioSet classes are used to evaluate the quality of the reconstructed music.

### Results:

- **Music Embedding Prediction**: MuLan music embeddings can be predicted from fMRI signals more accurately than the other embeddings tested (MuLan text embeddings, averaged w2v-BERT embeddings, and averaged SoundStream embeddings).
- **Qualitative Reconstruction Results**: Music retrieved from FMA and music generated by MusicLM are both semantically similar to the original stimuli, but the temporal structure is often not fully preserved.
- **Quantitative Reconstruction Evaluation**: Performance is significantly above chance on all metrics, supporting the feasibility of reconstructing music from fMRI data.

In conclusion, this paper demonstrates initial success in reconstructing music from human brain activity, providing a new perspective on how the brain processes and represents music.
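The decoding and retrieval steps summarized above can be sketched on synthetic data. This is a minimal illustration, not the authors' implementation. The summary says only "linear regression", so the ridge penalty `alpha`, the cosine-similarity retrieval, the pairwise definition of identification accuracy, and all array sizes other than the 128-dimensional embedding are assumptions. The random arrays stand in for real fMRI responses and MuLan embeddings, and every function name is hypothetical.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y.

    No intercept is fitted; the synthetic data below is zero-mean.
    """
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def l2_normalize(E, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return E / (np.linalg.norm(E, axis=1, keepdims=True) + eps)


def retrieve(pred, library):
    """Index of the most similar library embedding for each prediction."""
    sims = l2_normalize(pred) @ l2_normalize(library).T
    return sims.argmax(axis=1)


def identification_accuracy(pred, true):
    """Pairwise identification accuracy (assumed definition).

    For each predicted embedding, count the fraction of other candidates
    that are less similar to it than its own ground-truth embedding.
    """
    sims = l2_normalize(pred) @ l2_normalize(true).T
    matched = np.diag(sims)[:, None]
    wins = (matched > sims).sum(axis=1)  # the diagonal never beats itself
    return float(wins.mean() / (sims.shape[0] - 1))


def demo(seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical sizes; only the 128-dim embedding comes from the summary.
    n_train, n_test, n_voxels, dim = 400, 50, 300, 128
    n_total = n_train + n_test

    # Stand-in "MuLan" embeddings and a noisy linear "fMRI" response to them.
    emb = rng.standard_normal((n_total, dim))
    mixing = rng.standard_normal((dim, n_voxels)) / np.sqrt(dim)
    fmri = emb @ mixing + 0.5 * rng.standard_normal((n_total, n_voxels))

    # Decode: regress embeddings on fMRI, then predict for held-out scans.
    W = fit_ridge(fmri[:n_train], emb[:n_train], alpha=10.0)
    pred = fmri[n_train:] @ W

    # Retrieve from a library that here is just the held-out embeddings.
    top1 = retrieve(pred, emb[n_train:])
    top1_rate = float((top1 == np.arange(n_test)).mean())
    return W.shape, identification_accuracy(pred, emb[n_train:]), top1_rate


if __name__ == "__main__":
    shape, ident_acc, top1_rate = demo()
    print("weights:", shape)
    print("identification accuracy:", ident_acc)
    print("top-1 retrieval rate:", top1_rate)
```

In the pipeline described above, the predicted embedding would either select a clip from a music library or condition MusicLM. The sketch covers only the retrieval branch, because the generation model is not something a few lines can stand in for.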

Brain2Music: Reconstructing Music from Human Brain Activity

R&B -- Rhythm and Brain: Cross-subject Decoding of Music from Human Brain Activity

Music Imagery for Brain-Computer Interface Control.

The Study of Brain's Musical Function

Music can be reconstructed from human auditory cortex activity using nonlinear decoding models

Functional Neuroimaging of Stimulation by Music Using Positron Emission Tomography

Exploring Brain Dynamics via EEG and Steady-State Activation Map Networks in Music Composition

Spatiotemporal whole-brain activity and functional connectivity of melodies recognition

EEG2Mel: Reconstructing Sound from Brain Responses to Music

Generate the scale-free brain music from BOLD signals

Neural Correlates of Music Listening and Recall in the Human Brain.

Sound reconstruction from human brain activity via a generative model with brain-like auditory features

Music-Experience-Related and Musical-Error-Dependent Activations in the Brain

Linking Brain Responses to Naturalistic Music Through Analysis of Ongoing EEG and Stimulus Features

Deriving Electrophysiological Brain Network Connectivity Via Tensor Component Analysis During Freely Listening to Music

Exploring Frequency-Dependent Brain Networks from Ongoing EEG Using Spatial ICA During Music Listening

Scale-free Brain-Wave Music from Simultaneously EEG and fMRI Recordings.

Scale-Free Music Of The Brain

Music Composition from the Brain Signal: Representing the Mental State by Music

Naturalistic Music Decoding from EEG Data via Latent Diffusion Models

The Neural Mechanism Underlying Music Perception: A Meta-analysis of fMRI Studies