Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex

Spandan Madan, Will Xiao, Mingran Cao, Hanspeter Pfister, Margaret Livingstone, Gabriel Kreiman

2024-06-17

Abstract: We characterized the generalization capabilities of DNN-based encoding models when predicting neuronal responses from the visual cortex. We collected MacaqueITBench, a large-scale dataset of neural population responses from the macaque inferior temporal (IT) cortex to over 300,000 images, comprising 8,233 unique natural images presented to seven monkeys over 109 sessions. Using MacaqueITBench, we investigated the impact of distribution shifts on models predicting neural activity by dividing the images into Out-Of-Distribution (OOD) train and test splits. The OOD splits covered several image-computable shift types, including image contrast, hue, intensity, temperature, and saturation. Compared to the performance on in-distribution test images (the conventional way these models have been evaluated), models performed worse at predicting neuronal responses to out-of-distribution images, retaining as little as 20% of the performance on in-distribution test images. The generalization performance under OOD shifts can be well accounted for by a simple image similarity metric: the cosine distance between image representations extracted from a pre-trained object recognition model is a strong predictor of neural predictivity under different distribution shifts. The dataset of images, neuronal firing rate recordings, and computational benchmarks are hosted publicly at: https://bit.ly/3zeutVd

Signal Processing, Artificial Intelligence

What problem does this paper attempt to address?

This paper examines how well deep neural network (DNN) encoding models generalize when predicting neural responses in the macaque ventral visual cortex, particularly for out-of-distribution (OOD) images. The authors collected MacaqueITBench, a large-scale dataset of inferior temporal (IT) cortex responses to over 300,000 images, comprising 8,233 unique natural images presented to seven monkeys over 109 sessions. They studied the effect of distribution shifts on prediction accuracy by building OOD train and test splits along image-computable properties such as contrast, hue, intensity, temperature, and saturation. Models performed markedly worse on OOD images than on in-distribution images, retaining as little as 20% of their in-distribution performance. The paper shows that this drop is well accounted for by a simple image similarity measure: the cosine distance between image representations extracted from a pre-trained object recognition model, which is a strong predictor of neural predictivity under different distribution shifts.
The paper also notes that DNNs are known to generalize poorly to OOD data, a weakness that may carry over to DNN-based models of the visual cortex. The authors hypothesize that although DNN models predict neural responses well under random splits of the image set, specific train-test splits can degrade performance in proportion to the size of the distribution shift.
The main contributions of the paper include:
1. Providing MacaqueITBench, a large-scale dataset of neural responses in the macaque ventral visual pathway.
2. Demonstrating that modern models of the visual cortex generalize poorly to out-of-distribution images.
3. Showing that a simple measure of the size of the distribution shift can predict neural predictivity for out-of-distribution images.
The study is limited to baseline DNN models: it does not explore changes to model architecture or loss functions, or data augmentation strategies, that might improve generalization. Future research needs to improve how these models handle OOD data in order to make them more applicable and reliable in neuroscience.
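
The ingredients described above (an OOD split along an image-computable property, a cosine distance between image representations, and performance retained relative to the in-distribution baseline) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the RMS definition of contrast, the "train on low values, test on high values" split rule, the comparison of mean feature vectors, and the random arrays standing in for images and pre-trained model features are all hypothetical.

```python
import numpy as np


def rms_contrast(image):
    """RMS contrast: standard deviation of pixel values scaled to [0, 1].

    Assumption: one of several possible definitions of image contrast.
    """
    pixels = np.asarray(image, dtype=np.float64) / 255.0
    return float(pixels.std())


def ood_split(attribute_values, train_fraction=0.5):
    """Hypothetical split rule: train on the lowest attribute values, test on the rest."""
    order = np.argsort(attribute_values)
    n_train = int(len(order) * train_fraction)
    return order[:n_train], order[n_train:]


def cosine_distance(a, b):
    """Cosine distance, i.e. 1 minus the cosine similarity of two vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def retained_fraction(ood_score, id_score):
    """OOD performance expressed as a fraction of in-distribution performance."""
    return ood_score / id_score


# Synthetic stand-ins: 20 small grayscale "images" with increasing contrast.
rng = np.random.default_rng(0)
scales = np.linspace(0.1, 1.0, 20)
noise = rng.random((20, 8, 8)) - 0.5
images = 127.5 + 255.0 * scales[:, None, None] * noise

contrast = np.array([rms_contrast(img) for img in images])
train_idx, test_idx = ood_split(contrast)

# Placeholder features; in practice these would come from a pre-trained
# object recognition model (which model and layer is left open here).
features = rng.normal(size=(20, 16))
shift = cosine_distance(
    features[train_idx].mean(axis=0),
    features[test_idx].mean(axis=0),
)
print("train/test sizes:", len(train_idx), len(test_idx))
print("cosine distance between mean train and test features:", shift)
```

With real data, the shift value for each split could be plotted against the retained fraction of neural predictivity to check whether larger distances go with larger drops, which is the relationship the paper reports.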

How well do models of visual cortex generalize to out of distribution samples?

NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework

The Neural Representation Benchmark and its Evaluation on Brain and Machine

Determinantal Point Process Attention Over Grid Cell Code Supports Out of Distribution Generalization

Seeing eye-to-eye? A comparison of object recognition performance in humans and deep convolutional neural networks under image manipulation

OoD-Bench: Quantifying and Understanding Two Dimensions of Out-of-Distribution Generalization

Generalizability analysis of deep learning predictions of human brain responses to augmented and semantically novel visual stimuli

A large-scale examination of inductive biases shaping high-level visual representation in brains and machines

Partial success in closing the gap between human and machine vision

Neuron Activation Coverage: Rethinking Out-of-distribution Detection and Generalization

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

Iterative VAE as a predictive brain model for out-of-distribution generalization

Modeling the Human Visual System: Comparative Insights from Response-Optimized and Task-Optimized Vision Models, Language Models, and different Readout Mechanisms

ROOD-MRI: Benchmarking the robustness of deep learning segmentation models to out-of-distribution and corrupted data in MRI

OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images

Inferring Population Dynamics in Macaque Cortex

Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos

Visual DNA: Representing and Comparing Images using Distributions of Neuron Activations

Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream