Spatial encoding of BOLD fMRI time series for categorizing static images across visual datasets: A pilot study on human vision

Vamshi K. Kancharala, Debanjali Bhattacharya, Neelam Sinha

2023-09-07

Abstract: Functional MRI (fMRI) is widely used to examine brain function by detecting changes in oxygenated blood flow that accompany brain activity. In this study, complexity-specific image categorization across different visual datasets is performed using fMRI time series (TS) to understand differences in neuronal activity related to vision. The publicly available BOLD5000 dataset is used for this purpose; it contains fMRI scans acquired while subjects viewed 5254 images of diverse categories, drawn from three standard computer vision datasets: COCO, ImageNet and SUN. To understand vision, it is important to study how the brain functions while looking at different images. To this end, spatial encoding of the fMRI BOLD TS is performed using the classical Gramian Angular Field (GAF) and Markov Transition Field (MTF) to obtain 2D BOLD TS representations for images from COCO, ImageNet and SUN. For classification, individual GAF and MTF features are fed into a regular CNN. Subsequently, a parallel CNN model is employed that uses the combined 2D features to classify images across COCO, ImageNet and SUN. The results of the 2D CNN models are also compared with 1D LSTM and Bi-LSTM models that use the raw fMRI BOLD signal for classification. The parallel CNN model outperforms the other network models, with an improvement of 7% for multi-class classification. Clinical relevance: The results of this analysis establish a baseline for studying how differently the human brain functions while looking at images of diverse complexities.

Image and Video Processing, Artificial Intelligence, Computer Vision and Pattern Recognition, Signal Processing

What problem does this paper attempt to address?

This paper asks whether static images from different visual datasets can be classified from functional magnetic resonance imaging (fMRI) time series, as a way to understand how brain function differs when processing visual information of varying complexity. Specifically, the paper uses the publicly available BOLD5000 dataset, which contains fMRI scans of subjects viewing 5254 images from three standard computer vision datasets (COCO, ImageNet, and SUN). The main goal of the study is to test whether the brain's functional activity differs when processing images that vary in complexity, object scale, and background. To this end, the paper applies Gramian Angular Field (GAF) and Markov Transition Field (MTF) methods to encode each 1D fMRI time series as a 2D representation. These 2D features are then fed into a Convolutional Neural Network (CNN) to classify images by source dataset, and the results are compared with 1D LSTM and Bi-LSTM classifiers that operate on the raw BOLD signal. The results indicate that the parallel CNN model, which combines GAF and MTF features, outperforms the other models, with a 7% improvement in multi-class classification. This establishes a baseline for studying how the human brain functions differently when processing visual information of varying complexity.
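The two encodings named above can be illustrated with a short, dependency-free Python sketch. It is not the authors' implementation: the summation variant of the GAF, the quantile binning, the default of four bins, and all function names are assumptions chosen for illustration, since the summary does not specify them.

```python
import math
from bisect import bisect_right


def rescale(series):
    """Min-max rescale a series to [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]


def gramian_angular_summation_field(series):
    """Return the matrix G[i][j] = cos(phi_i + phi_j), phi = arccos(rescaled x).

    Assumption: the summation variant (GASF); the difference variant
    would use sin(phi_i - phi_j) instead.
    """
    # Clamp guards against tiny floating-point overshoot outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in rescale(series)]
    return [[math.cos(a + b) for b in phi] for a in phi]


def markov_transition_field(series, n_bins=4):
    """Return the matrix M[i][j] = P(bin(x_i) -> bin(x_j)).

    Transition probabilities are estimated from consecutive samples.
    Assumption: quantile bins; n_bins=4 is an illustrative default.
    """
    ordered = sorted(series)
    n = len(series)
    # Interior quantile cut points (n_bins - 1 of them).
    edges = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    bins = [bisect_right(edges, x) for x in series]

    # Count first-order transitions between the bins of adjacent samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1

    # Row-normalise counts into probabilities; empty rows stay at zero.
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])

    # Spread the transition probabilities over every pair of time points.
    return [[probs[a][b] for b in bins] for a in bins]
```

Both functions map a series of length n to an n-by-n matrix, which is what allows a 2D CNN to consume a 1D BOLD time series. A parallel model such as the one described would take one matrix of each kind per time series as its two inputs.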

Investigating the changes in BOLD responses during viewing of images with varied complexity: An fMRI time-series based analysis on human vision

Image complexity based fMRI-BOLD visual network categorization across visual datasets using topological descriptors and deep-hybrid learning

The scope and limits of fine-grained image and category information in the ventral visual pathway

Towards understanding the nature of direct functional connectivity in visual brain network

Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers

Cognitive state classification using transformed fMRI data

Multimodal deep neural decoding reveals highly resolved spatiotemporal profile of visual object representation in humans

EEG classification for visual brain decoding with spatio-temporal and transformer based paradigms

A New Approach for Analyzing Functional Neuroimaging Data Using a Combination of CNN-LSTM and Occlusion Sensitivity Analysis to Identify Important Brain Regions in Visual Mental Imagery

MCI Detection using fMRI time series embeddings of Recurrence plots

Convolutional neural network-based encoding and decoding of visual object recognition in space and time

Feed-forward hierarchical model of the ventral visual stream applied to functional brain image classification

Decoding fMRI data with support vector machines and deep neural networks

Category Decoding of Visual Stimuli From Human Brain Activity Using a Bidirectional Recurrent Neural Network to Simulate Bidirectional Information Flows in Human Visual Cortices

Ultra-high-resolution fMRI of Human Ventral Temporal Cortex Reveals Differential Representation of Categories and Domains

Learning to Decode Cognitive States from Brain Images

Neural encoding and interpretation for high-level visual cortices based on fMRI using image caption features

Ultra-high-resolution fMRI reveals differential representation of categories and domains across lateral and medial ventral temporal cortex

Decoding Categories from Human Brain Activity in the Human Visual Cortex Using the Triplet Network

BOLD5000: A public fMRI dataset of 5000 images