Optimizing deep video representation to match brain activity

Hugo Richard, Ana Pinho, Bertrand Thirion, Guillaume Charpiat
DOI: https://doi.org/10.48550/arXiv.1809.02440
2018-09-07
Abstract: The comparison of observed brain activity with the statistics generated by artificial intelligence systems is useful to probe brain functional organization under ecological conditions. Here we study fMRI activity in ten subjects watching color natural movies and compute deep representations of these movies with an architecture that relies on optical flow and image content. The association of activity in visual areas with the different layers of the deep architecture displays complexity-related contrasts across visual areas and reveals a striking foveal/peripheral dichotomy.
Neural and Evolutionary Computing, Computer Vision and Pattern Recognition, Machine Learning, Neurons and Cognition
What problem does this paper attempt to address?
This paper asks how well deep video representations can predict brain activity, and what the match reveals about the functional organization of the visual cortex. The researchers extract video features with a deep neural network and train linear models on those features to predict the fMRI activity of subjects watching natural color movies. The network is the Temporal Segment Network (TSN), pre-trained on a large-scale action recognition dataset; it processes both raw frames and optical flow fields to produce deep video representations. Relating the layers of the network to activity in different visual areas reveals complexity-related contrasts across the visual cortex, and in particular a marked difference between foveal and peripheral visual areas. The study also proposes a spatial compression scheme for deep video features that yields significantly better prediction performance than Principal Component Analysis (PCA). Together, these methods reproduce earlier findings on the functional organization of the visual cortex and show that features from different levels of a deep network predict activity in specific brain regions, offering a new perspective on how the brain processes visual information.
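The pipeline described above (compress deep features spatially, then fit a linear model that predicts voxel activity) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's implementation: the feature maps and voxel responses are synthetic random data rather than TSN outputs or fMRI recordings, block average pooling stands in for the spatial compression scheme, ridge regression stands in for the linear model, and the names `spatial_pool`, `fit_ridge`, and `r2_per_voxel` are hypothetical helpers.

```python
import numpy as np


def spatial_pool(features, factor):
    """Average-pool feature maps over non-overlapping factor x factor blocks.

    features: array of shape (n_samples, channels, height, width).
    Returns an array of shape (n_samples, channels, height // factor, width // factor).
    """
    n, c, h, w = features.shape
    h2, w2 = h // factor, w // factor
    # Crop so that height and width are exact multiples of the pooling factor.
    cropped = features[:, :, : h2 * factor, : w2 * factor]
    blocks = cropped.reshape(n, c, h2, factor, w2, factor)
    return blocks.mean(axis=(3, 5))


def fit_ridge(X, Y, alpha):
    """Closed-form ridge regression with an unpenalized intercept.

    X: (n_samples, n_features), Y: (n_samples, n_voxels).
    Returns weights W of shape (n_features, n_voxels) and intercept b of shape (n_voxels,).
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    W = np.linalg.solve(gram, Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b


def r2_per_voxel(Y_true, Y_pred):
    """Coefficient of determination computed separately for each voxel (column)."""
    ss_res = ((Y_true - Y_pred) ** 2).sum(axis=0)
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-ins (assumption): 250 time points, 8 channels, 6x6 feature maps, 5 voxels.
rng = np.random.default_rng(0)
n_train, n_test, n_voxels = 200, 50, 5
features = rng.standard_normal((n_train + n_test, 8, 6, 6))

# Compress each 6x6 map to 2x2, then flatten to one feature vector per time point.
pooled = spatial_pool(features, factor=3)
X = pooled.reshape(len(pooled), -1)

# Simulated voxel responses: a linear function of the compressed features plus noise.
true_W = rng.standard_normal((X.shape[1], n_voxels))
Y = X @ true_W + 0.1 * rng.standard_normal((len(X), n_voxels))

# Fit on the training split and score predictions on the held-out split.
W, b = fit_ridge(X[:n_train], Y[:n_train], alpha=1.0)
scores = r2_per_voxel(Y[n_train:], X[n_train:] @ W + b)
print("held-out R^2 per voxel:", np.round(scores, 3))
```

In this sketch, pooling preserves a coarse spatial layout of each feature map while shrinking the number of regressors, whereas PCA would mix all spatial positions into global components. Scoring on a held-out split, one score per voxel, is what makes it possible to compare how well different layers predict different brain regions.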