Abstract: Naturalistic stimuli, including movies, music, and speech, are increasingly used in neuroimaging research. Relative to resting-state or single-task paradigms, naturalistic stimuli evoke more intense brain activity and have been shown to have higher test-retest reliability, suggesting greater potential for studying adaptive human brain function. Naturalistic functional magnetic resonance imaging (N-fMRI) is a powerful tool for recording brain states under naturalistic stimuli, and many efforts have been devoted to studying high-level semantic features from spatial or temporal representations via N-fMRI. However, integrating both the spatial and the temporal characteristics of brain activity to better interpret the patterns evoked by naturalistic stimuli remains underexplored. In this work, we propose a novel hybrid learning framework that investigates both the spatial (via a Predictive Model) and the temporal [via a convolutional neural network (CNN) model] characteristics of the brain. Specifically, to focus on relevant regions within the whole brain, regions of significance (ROS), which contain spatial activation characteristics common across individuals, are selected via the Predictive Model. Further, voxels of significance (VOS), whose signals contain significant temporal characteristics under naturalistic stimuli, are interpreted via a one-dimensional CNN (1D-CNN) model. In this article, the proposed framework is applied to N-fMRI data acquired during naturalistic classical, pop, and speech audio stimuli. The Predictive Model achieves promising performance in differentiating the audio categories; in particular, the classification accuracy for distinguishing classical from speech audio is up to 92%. Moreover, spatial ROS and VOS are effectively obtained.
In addition, the temporal characteristics of the high-level semantic features are investigated in the frequency domain via the convolution kernels of the 1D-CNN model, and we effectively bridge the "semantic gap" between the high-level semantic features of N-fMRI and the low-level acoustic features of naturalistic audio in the frequency domain. Our results provide novel insights into characterizing spatiotemporal patterns of brain activity via N-fMRI and effectively explore the high-level semantic features under naturalistic stimuli, which will further benefit the understanding of the brain's working mechanisms and advance the clinical application of naturalistic stimuli.
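
As a rough illustration of what examining a 1D-CNN convolution kernel in the frequency domain can look like, the sketch below computes the magnitude response of a single kernel with a zero-padded FFT. It is not the authors' implementation: the function name `kernel_spectrum`, the 5-tap moving-average kernel, the repetition time of 2.0 s, and the FFT length of 64 are all assumptions chosen for the example.

```python
import numpy as np


def kernel_spectrum(kernel, tr_seconds, n_fft=64):
    """Return (frequencies in Hz, magnitude response) of a 1D convolution kernel.

    The kernel is zero-padded to n_fft samples; tr_seconds is the sampling
    interval of the fMRI time series (an assumed value in this sketch).
    """
    kernel = np.asarray(kernel, dtype=float)
    magnitude = np.abs(np.fft.rfft(kernel, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=tr_seconds)
    return freqs, magnitude


# Hypothetical kernel: a 5-tap moving average, which acts as a low-pass filter.
kernel = np.ones(5) / 5.0
freqs, magnitude = kernel_spectrum(kernel, tr_seconds=2.0)

# Frequency at which this kernel responds most strongly.
peak_hz = freqs[np.argmax(magnitude)]
print(f"peak response at {peak_hz:.3f} Hz; highest resolvable frequency {freqs[-1]:.3f} Hz")
```

With a trained model, the same function could be applied to each learned kernel in turn, and the resulting spectra compared against the spectrum of an acoustic feature resampled to the fMRI sampling rate.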

Decoding Auditory Saliency From fMRI Brain Imaging

Decoding Auditory Saliency from Brain Activity Patterns During Free Listening to Naturalistic Audio Excerpts

Decoding auditory attention (in real time) with EEG

Exploring Auditory Network Composition During Free Listening to Audio Excerpts Via Group-Wise Sparse Representation.

Functional Brain Networks Underlying Auditory Saliency During Naturalistic Listening Experience

Music/speech Classification Using High-Level Features Derived from fMRI Brain Imaging.

Reverse the auditory processing pathway: Coarse-to-fine audio reconstruction from fMRI

Sparsity-Constrained fMRI Decoding of Visual Saliency in Naturalistic Video Streams

Decoding power-spectral profiles from fMRI brain activities during naturalistic auditory experience

A hybrid learning framework for fine-grained interpretation of brain spatiotemporal patterns during naturalistic functional magnetic resonance imaging

Decoding Dynamic Auditory Attention During Naturalistic Experience.

Analysis of music/speech via integration of audio content and functional brain response.

Decoding Visual Experience and Mapping Semantics through Whole-Brain Analysis Using fMRI Foundation Models

A Hybrid Spatio-Temporal Deep Belief Network and Sparse Representation-Based Framework Reveals Multi-Level Core Functional Components in Decoding Multi-Task fMRI Signals

Bridging Low-Level Features and High-Level Semantics Via fMRI Brain Imaging for Video Classification

Bridging the Semantic Gap via Functional Brain Imaging

Brain Dialogue Interface (BDI): A User-Friendly fMRI Model for Interactive Brain Decoding

Data-driven analysis of functional brain interactions during free listening to music and speech

Decoding Continuous Character-based Language from Non-invasive Brain Recordings

fMRI-based Decoding of Visual Information from Human Brain Activity: A Brief Review