Teaching CORnet Human fMRI Representations for Enhanced Model-Brain Alignment

Zitong Lu, Yile Wang
2024-07-15
Abstract: Deep convolutional neural networks (DCNNs) have demonstrated excellent performance in object recognition and have been found to share some similarities with brain visual processing. However, a substantial gap between DCNNs and human visual perception still exists. Functional magnetic resonance imaging (fMRI), a widely used technique in cognitive neuroscience, can record neural activation in the human visual cortex during visual perception. Can we teach DCNNs human fMRI signals to achieve a more brain-like model? To answer this question, this study proposes ReAlnet-fMRI, a model based on the SOTA vision model CORnet but optimized using human fMRI data through a multi-layer encoding-based alignment framework. This framework has been shown to effectively enable the model to learn human brain representations. The fMRI-optimized ReAlnet-fMRI exhibited higher similarity to the human brain than both CORnet and the control model in within- and across-subject as well as within- and across-modality (fMRI and EEG) model-brain alignment evaluations. Additionally, we conducted in-depth analyses to investigate how the internal representations of ReAlnet-fMRI differ from those of CORnet in encoding various object dimensions. These findings show that integrating human neural data can enhance the brain-likeness of visual models, helping to bridge the gap between computer vision and visual neuroscience.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning, Neurons and Cognition
What problem does this paper attempt to address?
This paper addresses how to make deep convolutional neural networks (DCNNs) more similar to human visual processing by integrating human neuroimaging data. It proposes ReAlnet-fMRI, a model based on the state-of-the-art vision model CORnet and optimized with human functional magnetic resonance imaging (fMRI) data through a multi-layer encoding-based alignment framework. The aim is to narrow the substantial gap between DCNNs and human visual perception. In model-brain alignment evaluations within and across subjects, and within and across modalities (fMRI and EEG), ReAlnet-fMRI shows higher similarity to human brain representations than both CORnet and a control model. In-depth analyses of internal representations further reveal how ReAlnet-fMRI and CORnet differ in encoding various object dimensions, indicating that integrating human neuroimaging data can enhance the brain-likeness of visual models.
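
The text above does not say which metric the model-brain alignment evaluations use. One common way to compare a model with brain recordings is representational similarity analysis (RSA), and the sketch below illustrates that general idea only; it is not the authors' evaluation code. The function names and the toy response values are hypothetical and made up for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rdm_upper_triangle(responses):
    """Dissimilarity (1 - Pearson r) for every pair of stimuli.

    `responses` holds one response pattern per stimulus
    (model unit activations or voxel activations).
    """
    return [1.0 - pearson(a, b) for a, b in combinations(responses, 2)]


def rsa_similarity(model_responses, brain_responses):
    """Correlate the model's dissimilarity structure with the brain's."""
    return pearson(rdm_upper_triangle(model_responses),
                   rdm_upper_triangle(brain_responses))


# Toy data (made up): 4 stimuli, 3 model units, 4 voxels.
# Stimuli 0 and 1 evoke similar patterns, as do stimuli 2 and 3.
model = [[1.0, 0.0, 2.0], [0.9, 0.1, 2.1], [0.0, 3.0, 1.0], [0.2, 2.8, 0.5]]
brain = [[2.0, 1.0, 0.0, 1.0], [2.1, 0.9, 0.1, 1.2],
         [0.0, 2.0, 3.0, 0.5], [0.1, 2.2, 2.9, 0.4]]

score = rsa_similarity(model, brain)
print(f"model-brain RSA similarity: {score:.3f}")
```

Because RSA compares pairwise dissimilarities rather than raw activations, the model and the brain data may have different numbers of units and voxels, which is why it is often used for comparisons across subjects and recording modalities. Published analyses frequently use Spearman rather than Pearson correlation for the final comparison; Pearson is used here only to keep the sketch dependency-free.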