Abstract: Neuroimaging studies increasingly use naturalistic stimuli such as video clips to evoke complex brain activations, but the complexity of these stimuli makes it difficult to assign specific functions to the resulting activations, particularly for higher-level content such as social interactions. To address this challenge, researchers have turned to deep neural networks such as convolutional neural networks (CNNs). CNNs succeed at image recognition in part because they learn features at multiple levels of abstraction. In this study, we used a pre-trained VGG-16, a popular CNN model, to analyze video data and extract hierarchical features, from low-level features in shallow layers to high-level features in deeper layers, and linked these activations to different levels of activation in the human brain. We hypothesized that activations in different layers of VGG-16 would be associated with different levels of the brain's visual processing hierarchy. We also asked which brain regions would be associated with the deeper convolutional layers of VGG-16. We analyzed a functional MRI (fMRI) dataset in which participants watched the cartoon movie Partly Cloudy. Frames of the video were fed into VGG-16, and activation maps from different kernels and layers were extracted. A time series of the average activation for each kernel was created and entered into a voxel-wise model of brain activation. Lower convolutional layers (the first convolutional layer) were mostly associated with lower visual regions, but some kernels (6, 19, 24, 42, 55, and 58) unexpectedly showed associations with activations in the posterior cingulate cortex, part of the default mode network. Deeper convolutional layers were associated with more anterior and lateral portions of the visual cortex (e.g., the lateral occipital complex) and with the supramarginal gyrus. Analyzing the activation features associated with different brain regions showed both the promise and the limitations of using CNNs to link video content to brain functions.
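
The pipeline described in the abstract (per-kernel activation maps, averaged into a time series, then related to each voxel) can be illustrated with a minimal sketch. This is not the authors' code. It assumes the activation maps have already been extracted into an array, uses synthetic data in place of VGG-16 outputs and fMRI signals, and assumes ordinary least squares as the voxel-wise model. It also omits steps a real analysis would likely need, such as convolution with a hemodynamic response function and nuisance regressors. All function names and the `frames_per_scan` parameter are hypothetical.

```python
import numpy as np

def kernel_time_series(activations):
    """Average each kernel's activation map over space.

    activations: (n_frames, n_kernels, height, width)
    returns:     (n_frames, n_kernels)
    """
    return activations.mean(axis=(2, 3))

def resample_to_scans(series, frames_per_scan):
    """Average consecutive frames so there is one value per fMRI volume."""
    n_scans = series.shape[0] // frames_per_scan
    trimmed = series[: n_scans * frames_per_scan]
    return trimmed.reshape(n_scans, frames_per_scan, -1).mean(axis=1)

def voxelwise_fit(regressor, bold):
    """Least-squares fit of every voxel on one regressor plus an intercept.

    regressor: (n_scans,)
    bold:      (n_scans, n_voxels)
    returns:   slope for each voxel, shape (n_voxels,)
    """
    design = np.column_stack([regressor, np.ones_like(regressor)])
    betas, *_ = np.linalg.lstsq(design, bold, rcond=None)
    return betas[0]

# Synthetic stand-in for one layer's activation maps (assumption, not real data).
rng = np.random.default_rng(0)
acts = rng.random((120, 4, 8, 8))  # 120 frames, 4 kernels, 8x8 maps

# One value per kernel per scan, assuming 6 frames per fMRI volume.
series = resample_to_scans(kernel_time_series(acts), frames_per_scan=6)

# Two noise-free synthetic "voxels" that follow kernel 2 with known slopes.
bold = np.outer(series[:, 2], [2.0, -1.0]) + 3.0
slopes = voxelwise_fit(series[:, 2], bold)
```

Fitting one kernel at a time, as here, mirrors the abstract's description of a time series per kernel. Whether the kernels were modeled separately or jointly is not stated, so that choice is also an assumption of this sketch.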
