An improved fluorimetric assay for brain monoamine oxidase.

A. Morinan,H. Garratt

DOI: https://doi.org/10.1016/0160-5402(85)90021-X

1985-06-01

Journal of Pharmacological Methods

Seeing eye-to-eye? A comparison of object recognition performance in humans and deep convolutional neural networks under image manipulation

Leonard E. van Dyck,Walter R. Gruber

DOI: https://doi.org/10.48550/arXiv.2007.06294

2020-12-13

Abstract:For a considerable time, deep convolutional neural networks (DCNNs) have reached human benchmark performance in object recognition. On that account, computational neuroscience and the field of machine learning have started to attribute numerous similarities and differences to artificial and biological vision. This study aims at a behavioral comparison of visual core object recognition performance between humans and feedforward neural networks in a classification learning paradigm on an ImageNet data set. For this purpose, human participants (n = 65) competed in an online experiment against different feedforward DCNNs. The designed approach, based on a typical learning process of seven different monkey categories, included a training and validation phase with natural examples, as well as a testing phase with novel, unexperienced shape and color manipulations. Analyses of accuracy revealed that humans not only outperform DCNNs on all conditions, but also display significantly greater robustness towards shape and most notably color alterations. Furthermore, a precise examination of behavioral patterns highlights these findings by revealing independent classification errors between the groups. The obtained results show that humans contrast strongly with artificial feedforward architectures when it comes to visual core object recognition of manipulated images. In general, these findings are in line with a growing body of literature that hints towards recurrence as a crucial factor for adequate generalization abilities.

Computer Vision and Pattern Recognition,Machine Learning,Image and Video Processing,Neurons and Cognition

Humans and deep networks largely agree on which kinds of variation make object recognition harder

Saeed Reza Kheradpisheh,Masoud Ghodrati,Mohammad Ganjtabesh,Timothée Masquelier

DOI: https://doi.org/10.3389/fncom.2016.00092

2016-04-22

Abstract:View-invariant object recognition is a challenging problem, which has attracted much attention among the psychology, neuroscience, and computer vision communities. Humans are notoriously good at it, even if some variations are presumably more difficult to handle than others (e.g. 3D rotations). Humans are thought to solve the problem through hierarchical processing along the ventral stream, which progressively extracts more and more invariant visual features. This feed-forward architecture has inspired a new generation of bio-inspired computer vision systems called deep convolutional neural networks (DCNN), which are currently the best algorithms for object recognition in natural images. Here, for the first time, we systematically compared human feed-forward vision and DCNNs at view-invariant object recognition using the same images and controlling for both the kinds of transformation as well as their magnitude. We used four object categories and images were rendered from 3D computer models. In total, 89 human subjects participated in 10 experiments in which they had to discriminate between two or four categories after rapid presentation with backward masking. We also tested two recent DCNNs on the same tasks. We found that humans and DCNNs largely agreed on the relative difficulties of each kind of variation: rotation in depth is by far the hardest transformation to handle, followed by scale, then rotation in plane, and finally position. This suggests that humans recognize objects mainly through 2D template matching, rather than by constructing 3D object models, and that DCNNs are not too unreasonable models of human feed-forward vision. Also, our results show that the variation levels in rotation in depth and scale strongly modulate both humans' and DCNNs' recognition performances. We thus argue that these variations should be controlled in the image datasets used in vision research.

Computer Vision and Pattern Recognition,Neurons and Cognition

A large-scale examination of inductive biases shaping high-level visual representation in brains and machines

Colin Conwell,Jacob S. Prince,Kendrick N. Kay,George A. Alvarez,Talia Konkle

DOI: https://doi.org/10.1038/s41467-024-53147-y

IF: 16.6

2024-10-31

Nature Communications

Abstract:The rapid release of high-performing computer vision models offers new potential to study the impact of different inductive biases on the emergent brain alignment of learned representations. Here, we perform controlled comparisons among a curated set of 224 diverse models to test the impact of specific model properties on visual brain predictivity – a process requiring over 1.8 billion regressions and 50.3 thousand representational similarity analyses. We find that models with qualitatively different architectures (e.g. CNNs versus Transformers) and task objectives (e.g. purely visual contrastive learning versus vision-language alignment) achieve near equivalent brain predictivity, when other factors are held constant. Instead, variation across visual training diets yields the largest, most consistent effect on brain predictivity. Many models achieve similarly high brain predictivity, despite clear variation in their underlying representations – suggesting that standard methods used to link models to brains may be too flexible. Broadly, these findings challenge common assumptions about the factors underlying emergent brain alignment, and outline how we can leverage controlled model comparison to probe the common computational principles underlying biological and artificial visual systems.

multidisciplinary sciences

Evaluating (and Improving) the Correspondence Between Deep Neural Networks and Human Representations

Joshua C. Peterson,Joshua T. Abbott,Thomas L. Griffiths

DOI: https://doi.org/10.1111/cogs.12670

IF: 2.617

2018-09-03

Cognitive Science

Abstract:Decades of psychological research have been aimed at modeling how people learn features and categories. The empirical validation of these theories is often based on artificial stimuli with simple representations. Recently, deep neural networks have reached or surpassed human accuracy on tasks such as identifying objects in natural images. These networks learn representations of real-world stimuli that can potentially be leveraged to capture psychological representations. We find that state-of-the-art object classification networks provide surprisingly accurate predictions of human similarity judgments for natural images, but they fail to capture some of the structure represented by people. We show that a simple transformation that corrects these discrepancies can be obtained through convex optimization. We use the resulting representations to predict the difficulty of learning novel categories of natural images. Our results extend the scope of psychological experiments and computational modeling by enabling tractable use of large natural stimulus sets.

psychology, experimental

Dimensions underlying the representational alignment of deep neural networks with humans

Florian P. Mahner,Lukas Muttenthaler,Umut Güçlü,Martin N. Hebart

2024-06-27

Abstract:Determining the similarities and differences between humans and artificial intelligence is an important goal both in machine learning and cognitive neuroscience. However, similarities in representations only inform us about the degree of alignment, not the factors that determine it. Drawing upon recent developments in cognitive science, we propose a generic framework for yielding comparable representations in humans and deep neural networks (DNN). Applying this framework to humans and a DNN model of natural images revealed a low-dimensional DNN embedding of both visual and semantic dimensions. In contrast to humans, DNNs exhibited a clear dominance of visual over semantic features, indicating divergent strategies for representing images. While in-silico experiments showed seemingly-consistent interpretability of DNN dimensions, a direct comparison between human and DNN representations revealed substantial differences in how they process images. By making representations directly comparable, our results reveal important challenges for representational alignment, offering a means for improving their comparability.

Computer Vision and Pattern Recognition,Artificial Intelligence,Machine Learning,Quantitative Methods

Improved object recognition using neural networks trained to mimic the brain's statistical properties

Callie Federer,Haoyan Xu,Alona Fyshe,Joel Zylberberg

DOI: https://doi.org/10.48550/arXiv.1905.10679

2020-07-16

Abstract:The current state-of-the-art object recognition algorithms, deep convolutional neural networks (DCNNs), are inspired by the architecture of the mammalian visual system, and are capable of human-level performance on many tasks. However, even these algorithms make errors. As they are trained for object recognition tasks, it has been shown that DCNNs develop hidden representations that resemble those observed in the mammalian visual system. Moreover, DCNNs trained on object recognition tasks are currently among the best models we have of the mammalian visual system. This led us to hypothesize that teaching DCNNs to achieve even more brain-like representations could improve their performance. To test this, we trained DCNNs on a composite task, wherein networks were trained to: a) classify images of objects; while b) having intermediate representations that resemble those observed in neural recordings from monkey visual cortex. Compared with DCNNs trained purely for object categorization, DCNNs trained on the composite task had better object recognition performance and are more robust to label corruption. Interestingly, we also found that neural data was not required, but randomized data with the same statistics as neural data also boosted performance. Our results outline a new way to train object recognition networks, using strategies in which the brain - or at least the statistical properties of its activation patterns - serves as a teacher signal for training DCNNs.

Computer Vision and Pattern Recognition

How well do models of visual cortex generalize to out of distribution samples?

Yifei Ren,Pouya Bashivan

DOI: https://doi.org/10.1371/journal.pcbi.1011145

2024-06-01

PLoS Computational Biology

Abstract:Unit activity in particular deep neural networks (DNNs) is remarkably similar to the neuronal population responses to static images along the primate ventral visual cortex. Linear combinations of DNN unit activities are widely used to build predictive models of neuronal activity in the visual cortex. Nevertheless, prediction performance in these models is often investigated on stimulus sets consisting of everyday objects under naturalistic settings. Recent work has revealed a generalization gap in predicting neuronal responses to synthetically generated out-of-distribution (OOD) stimuli. Here, we investigated how the recent progress in improving DNNs' object recognition generalization, as well as various DNN design choices such as architecture, learning algorithm, and datasets, have impacted the generalization gap in neural predictivity. We came to the surprising conclusion that the performance on none of the common computer vision OOD object recognition benchmarks is predictive of OOD neural predictivity performance. Furthermore, we found that adversarially robust models often yield substantially higher generalization in neural predictivity, although the degree of robustness itself was not predictive of neural predictivity score. These results suggest that improving object recognition behavior on current benchmarks alone may not lead to more general models of neurons in the primate ventral visual cortex.

Inspired by the neural circuits of the brain, deep neural networks (DNNs) have been steadily improving in their ability to perform foundational visual tasks such as object recognition. Whereas early models struggled with generalization to abstract visual domains such as line drawings and cartoons, recent advances have approached near-human recognition capabilities. Moreover, the unit activity in these networks exhibits strong similarities with the activity of single-unit recordings along the primate ventral visual cortex. This capability of DNNs has provided visual neuroscientists with precise models for exploring the neural underpinnings of object recognition. Our research probes whether enhancements in neural networks' recognition of out-of-distribution objects correlate with improved predictability of brain activity in the visual cortex of monkeys in response to synthetic stimuli. We found that the out-of-distribution object recognition performance on natural image datasets is not a reliable measure of neural predictivity. However, DNN models that were trained to be more resilient to adversarially generated noise patterns, as well as DNN ensembles, consistently yielded better generalization in neural predictivity. Altogether, our results suggest that improving object recognition behaviour on current benchmarks alone may not lead to more general models of neurons in the primate ventral visual cortex.

biochemical research methods,mathematical & computational biology

Differentiable Optimization of Similarity Scores Between Models and Brains

Nathan Cloos,Moufan Li,Markus Siegel,Scott L. Brincat,Earl K. Miller,Guangyu Robert Yang,Christopher J. Cueva

2024-10-22

Abstract:How do we know if two systems - biological or artificial - process information in a similar way? Similarity measures such as linear regression, Centered Kernel Alignment (CKA), Normalized Bures Similarity (NBS), and angular Procrustes distance, are often used to quantify this similarity. However, it is currently unclear what drives high similarity scores and even what constitutes a "good" score. Here, we introduce a novel tool to investigate these questions by differentiating through similarity measures to directly maximize the score. Surprisingly, we find that high similarity scores do not guarantee encoding task-relevant information in a manner consistent with neural data; and this is particularly acute for CKA and even some variations of cross-validated and regularized linear regression. We find no consistent threshold for a good similarity score - it depends on both the measure and the dataset. In addition, synthetic datasets optimized to maximize similarity scores initially learn the highest variance principal component of the target dataset, but some methods like angular Procrustes capture lower variance dimensions much earlier than methods like CKA. To shed light on this, we mathematically derive the sensitivity of CKA, angular Procrustes, and NBS to the variance of principal component dimensions, and explain the emphasis CKA places on high variance components. Finally, by jointly optimizing multiple similarity measures, we characterize their allowable ranges and reveal that some similarity measures are more constraining than others. While current measures offer a seemingly straightforward way to quantify the similarity between neural systems, our work underscores the need for careful interpretation. We hope the tools we developed will be used by practitioners to better understand current and future similarity measures.
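For readers unfamiliar with the measures named in this abstract, the following is a minimal sketch of linear Centered Kernel Alignment, using the standard feature-space formula (squared Frobenius norm of the cross-covariance, normalized by the Frobenius norms of the two covariances). It is not taken from the paper; the function name `linear_cka` and the example matrices are illustrative assumptions.

```python
import numpy as np


def linear_cka(X, Y):
    """Linear CKA between two response matrices of shape (n_samples, n_features).

    Hypothetical helper for illustration; rows of X and Y must correspond
    to the same stimuli, but the feature counts may differ.
    """
    # Center each feature (column) so the score ignores mean offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance (unnormalized).
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    # Normalizers: Frobenius norms of each system's own covariance.
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = rng.normal(size=(50, 8))   # stand-in for model activations
    brain = rng.normal(size=(50, 5))   # stand-in for neural recordings
    print(linear_cka(model, brain))
```

Because the score depends on squared covariance terms, directions with large variance contribute most to it, which is consistent with the abstract's remark about the emphasis CKA places on high-variance components.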

Neurons and Cognition,Machine Learning

Evaluating Representational Similarity Measures from the Lens of Functional Correspondence

Yiqing Bo,Ansh Soni,Sudhanshu Srivastava,Meenakshi Khosla

2024-11-22

Abstract:Neuroscience and artificial intelligence (AI) both face the challenge of interpreting high-dimensional neural data, where the comparative analysis of such data is crucial for revealing shared mechanisms and differences between these complex systems. Despite the widespread use of representational comparisons and the abundance of classes of comparison methods, a critical question remains: which metrics are most suitable for these comparisons? While some studies evaluate metrics based on their ability to differentiate models of different origins or constructions (e.g., various architectures), another approach is to assess how well they distinguish models that exhibit distinct behaviors. To investigate this, we examine the degree of alignment between various representational similarity measures and behavioral outcomes, employing group statistics and a comprehensive suite of behavioral metrics for comparison. In our evaluation of eight commonly used representational similarity metrics in the visual domain -- spanning alignment-based, Canonical Correlation Analysis (CCA)-based, inner product kernel-based, and nearest-neighbor methods -- we found that metrics like linear Centered Kernel Alignment (CKA) and Procrustes distance, which emphasize the overall geometric structure or shape of representations, excelled in differentiating trained from untrained models and aligning with behavioral measures, whereas metrics such as linear predictivity, commonly used in neuroscience, demonstrated only moderate alignment with behavior. These insights are crucial for selecting metrics that emphasize behaviorally meaningful comparisons in NeuroAI research.

Neurons and Cognition,Artificial Intelligence,Computer Vision and Pattern Recognition

Using drawings and deep neural networks to characterize the building blocks of human visual similarity

Kushin Mukherjee,Timothy T. Rogers

DOI: https://doi.org/10.3758/s13421-024-01580-1

2024-06-01

Memory & Cognition

Abstract:Early in life and without special training, human beings discern resemblance between abstract visual stimuli, such as drawings, and the real-world objects they represent. We used this capacity for visual abstraction as a tool for evaluating deep neural networks (DNNs) as models of human visual perception. Contrasting five contemporary DNNs, we evaluated how well each explains human similarity judgments among line drawings of recognizable and novel objects. For object sketches, human judgments were dominated by semantic category information; DNN representations contributed little additional information. In contrast, such features explained significant unique variance in the perceived similarity of abstract drawings. In both cases, a vision transformer trained to blend representations of images and their natural language descriptions showed the greatest ability to explain human perceptual similarity—an observation consistent with contemporary views of semantic representation and processing in the human mind and brain. Together, the results suggest that the building blocks of visual similarity may arise within systems that learn to use visual information, not for specific classification, but in service of generating semantic representations of objects.

psychology, experimental

Divergences in color perception between deep neural networks and humans

Ethan O Nadler,Elise Darragh-Ford,Bhargav Srinivasa Desikan,Christian Conaway,Mark Chu,Tasker Hull,Douglas Guilbeault

DOI: https://doi.org/10.1016/j.cognition.2023.105621

IF: 4.011

Cognition

Abstract:Deep neural networks (DNNs) are increasingly proposed as models of human vision, bolstered by their impressive performance on image classification and object recognition tasks. Yet, the extent to which DNNs capture fundamental aspects of human vision such as color perception remains unclear. Here, we develop novel experiments for evaluating the perceptual coherence of color embeddings in DNNs, and we assess how well these algorithms predict human color similarity judgments collected via an online survey. We find that state-of-the-art DNN architectures - including convolutional neural networks and vision transformers - provide color similarity judgments that strikingly diverge from human color judgments of (i) images with controlled color properties, (ii) images generated from online searches, and (iii) real-world images from the canonical CIFAR-10 dataset. We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition, inspired by foundational theories in computational neuroscience. While one deep learning model - a convolutional DNN trained on a style transfer task - captures some aspects of human color perception, our wavelet algorithm provides more coherent color embeddings that better predict human color judgments compared to all DNNs we examine. These results hold when altering the high-level visual task used to train similar DNN architectures (e.g., image classification versus image segmentation), as well as when examining the color embeddings of different layers in a given DNN architecture. These findings break new ground in the effort to analyze the perceptual representations of machine learning algorithms and to improve their ability to serve as cognitively plausible models of human vision. Implications for machine learning, human perception, and embodied cognition are discussed.

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

Charles F. Cadieu,Ha Hong,Daniel L. K. Yamins,Nicolas Pinto,Diego Ardila,Ethan A. Solomon,Najib J. Majaj,James J. DiCarlo

DOI: https://doi.org/10.1371/journal.pcbi.1003963

2014-06-13

Abstract:The primate visual system achieves remarkable visual object recognition performance even in brief presentations and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. To accurately produce such a comparison, a major difficulty has been a unifying metric that accounts for experimental limitations such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations such as the complexity of the decoding classifier and the number of classifier training examples. In this work we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of "kernel analysis" that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds.

Neurons and Cognition,Neural and Evolutionary Computing

Comparing object recognition in humans and deep convolutional neural networks -- An eye tracking study

Leonard E. van Dyck,Roland Kwitt,Sebastian J. Denzler,Walter R. Gruber

DOI: https://doi.org/10.48550/arXiv.2108.00107

2021-09-21

Abstract:Deep convolutional neural networks (DCNNs) and the ventral visual pathway share vast architectural and functional similarities in visual challenges such as object recognition. Recent insights have demonstrated that both hierarchical cascades can be compared in terms of both exerted behavior and underlying activation. However, these approaches ignore key differences in spatial priorities of information processing. In this proof-of-concept study, we demonstrate a comparison of human observers (N = 45) and three feedforward DCNNs through eye tracking and saliency maps. The results reveal fundamentally different resolutions in both visualization methods that need to be considered for an insightful comparison. Moreover, we provide evidence that a DCNN with biologically plausible receptive field sizes called vNet reveals higher agreement with human viewing behavior as contrasted with a standard ResNet architecture. We find that image-specific factors such as category, animacy, arousal, and valence have a direct link to the agreement of spatial object recognition priorities in humans and DCNNs, while other measures such as difficulty and general image properties do not. With this approach, we try to open up new perspectives at the intersection of biological and computer vision research.

Computer Vision and Pattern Recognition

Aligning Machine and Human Visual Representations across Abstraction Levels

Lukas Muttenthaler,Klaus Greff,Frieda Born,Bernhard Spitzer,Simon Kornblith,Michael C. Mozer,Klaus-Robert Müller,Thomas Unterthiner,Andrew K. Lampinen

2024-10-29

Abstract:Deep neural networks have achieved success across a wide range of applications, including as models of human behavior in vision tasks. However, neural network training and human learning differ in fundamental ways, and neural networks often fail to generalize as robustly as humans do, raising questions regarding the similarity of their underlying representations. What is missing for modern learning systems to exhibit more human-like behavior? We highlight a key misalignment between vision models and humans: whereas human conceptual knowledge is hierarchically organized from fine- to coarse-scale distinctions, model representations do not accurately capture all these levels of abstraction. To address this misalignment, we first train a teacher model to imitate human judgments, then transfer human-like structure from its representations into pretrained state-of-the-art vision foundation models. These human-aligned models more accurately approximate human behavior and uncertainty across a wide range of similarity tasks, including a new dataset of human judgments spanning multiple levels of semantic abstractions. They also perform better on a diverse set of machine learning tasks, increasing generalization and out-of-distribution robustness. Thus, infusing neural networks with additional human knowledge yields a best-of-both-worlds representation that is both more consistent with human cognition and more practically useful, thus paving the way toward more robust, interpretable, and human-like artificial intelligence systems.

Computer Vision and Pattern Recognition,Artificial Intelligence,Machine Learning

Deep Neural Networks and Visuo-Semantic Models Explain Complementary Components of Human Ventral-Stream Representational Dynamics

Kamila M Jozwik,Tim C Kietzmann,Radoslaw M Cichy,Nikolaus Kriegeskorte,Marieke Mur

DOI: https://doi.org/10.1523/JNEUROSCI.1424-22.2022

2023-03-08

Abstract:Deep neural networks (DNNs) are promising models of the cortical computations supporting human object recognition. However, despite their ability to explain a significant portion of variance in neural data, the agreement between models and brain representational dynamics is far from perfect. We address this issue by asking which representational features are currently unaccounted for in neural time series data, estimated for multiple areas of the ventral stream via source-reconstructed magnetoencephalography data acquired in human participants (nine females, six males) during object viewing. We focus on the ability of visuo-semantic models, consisting of human-generated labels of object features and categories, to explain variance beyond the explanatory power of DNNs alone. We report a gradual reversal in the relative importance of DNN versus visuo-semantic features as ventral-stream object representations unfold over space and time. Although lower-level visual areas are better explained by DNN features starting early in time (at 66 ms after stimulus onset), higher-level cortical dynamics are best accounted for by visuo-semantic features starting later in time (at 146 ms after stimulus onset). Among the visuo-semantic features, object parts and basic categories drive the advantage over DNNs. These results show that a significant component of the variance unexplained by DNNs in higher-level cortical dynamics is structured and can be explained by readily nameable aspects of the objects. We conclude that current DNNs fail to fully capture dynamic representations in higher-level human visual cortex and suggest a path toward more accurate models of ventral-stream computations.

Significance Statement: When we view objects such as faces and cars in our visual environment, their neural representations dynamically unfold over time at a millisecond scale. These dynamics reflect the cortical computations that support fast and robust object recognition. DNNs have emerged as a promising framework for modeling these computations but cannot yet fully account for the neural dynamics. Using magnetoencephalography data acquired in human observers during object viewing, we show that readily nameable aspects of objects, such as 'eye', 'wheel', and 'face', can account for variance in the neural dynamics over and above DNNs. These findings suggest that DNNs and humans may in part rely on different object features for visual recognition and provide guidelines for model improvement.

Variation in the geometry of concept manifolds across human visual cortex

Ghislain St-Yves,Kendrick Kay,Thomas Naselaris

DOI: https://doi.org/10.1101/2024.11.26.625280

2024-11-26

Abstract:Brain activity patterns in high-level visual cortex support accurate linear classification of visual concepts (e.g., objects or scenes). It has long been appreciated that the accuracy of linear classification in any brain area depends on the geometry of its concept manifolds—sets of brain activity patterns that encode images of a concept. However, it is unclear how the geometry of concept manifolds differs between regions of visual cortex that support accurate classification and those that don’t, or how it differs between visual cortex and deep neural networks (DNNs). We estimated geometric properties of concept manifolds that, per a recent theory, directly determine the accuracy of simple “few-shot” linear classifiers. Using a large fMRI dataset, we show that variation in classification accuracy across human visual cortex is driven by a variation in a single geometric property: the distance between manifold centers (“geometric Signal”). In contrast, variation in classification accuracy across most DNN layers is driven by an increase in the effective number of manifold dimensions (“Dimensionality”). Despite this difference in the geometric properties that affect few-shot classification performance in the brain and DNNs, we find that Signal and Dimensionality are strongly, negatively correlated: when Signal increases across brain regions or DNN layers, Dimensionality decreases, and vice versa. We conclude that visual cortex and DNNs deploy different geometric strategies for accurate linear classification of concepts, even though both are subject to the same constraint.

Neuroscience

Leveraging the Human Ventral Visual Stream to Improve Neural Network Robustness

Zhenan Shao,Linjian Ma,Bo Li,Diane M. Beck

2024-05-04

Abstract:Human object recognition exhibits remarkable resilience in cluttered and dynamic visual environments. In contrast, despite their unparalleled performance across numerous visual tasks, Deep Neural Networks (DNNs) remain far less robust than humans, showing, for example, a surprising susceptibility to adversarial attacks involving image perturbations that are (almost) imperceptible to humans. Human object recognition likely owes its robustness, in part, to the increasingly resilient representations that emerge along the hierarchy of the ventral visual cortex. Here we show that DNNs, when guided by neural representations from a hierarchical sequence of regions in the human ventral visual stream, display increasing robustness to adversarial attacks. These neural-guided models also exhibit a gradual shift towards more human-like decision-making patterns and develop hierarchically smoother decision surfaces. Importantly, the resulting representational spaces differ in important ways from those produced by conventional smoothing methods, suggesting that such neural-guidance may provide previously unexplored robustness solutions. Our findings support the gradual emergence of human robustness along the ventral visual hierarchy and suggest that the key to DNN robustness may lie in increasing emulation of the human brain.

Computer Vision and Pattern Recognition,Artificial Intelligence,Neurons and Cognition
Dissociable Neural Representations of Adversarially Perturbed Images in Convolutional Neural Networks and the Human Brain

Chi Zhang,Xiao-Han Duan,Lin-Yuan Wang,Yong-Li Li,Bin Yan,Guo-En Hu,Ru-Yuan Zhang,Li Tong

DOI: https://doi.org/10.3389/fninf.2021.677925

2021-08-05

Frontiers in Neuroinformatics

Abstract:Despite the remarkable similarities between convolutional neural networks (CNNs) and the human brain, CNNs still fall behind humans in many visual tasks, indicating that there still exist considerable differences between the two systems. Here, we leverage adversarial noise (AN) and adversarial interference (AI) images to quantify the consistency between neural representations and perceptual outcomes in the two systems. Humans can successfully recognize AI images as the same categories as their corresponding regular images but perceive AN images as meaningless noise. In contrast, CNNs can recognize AN images similarly to their corresponding regular images but classify AI images into wrong categories with surprisingly high confidence. We use functional magnetic resonance imaging to measure brain activity evoked by regular and adversarial images in the human brain, and compare it to the activity of artificial neurons in a prototypical CNN, AlexNet. In the human brain, we find that the representational similarity between regular and adversarial images largely echoes their perceptual similarity in all early visual areas. In AlexNet, however, the neural representations of adversarial images are inconsistent with network outputs in all intermediate processing layers, providing no neural foundations for the similarities at the perceptual level. Furthermore, we show that voxel-encoding models trained on regular images can successfully generalize to the neural responses to AI images but not AN images. These remarkable differences between the human brain and AlexNet in representation-perception association suggest that future CNNs should emulate both behavior and the internal neural representations of the human brain.

neurosciences,mathematical & computational biology
The Neural Representation Benchmark and its Evaluation on Brain and Machine

Charles F. Cadieu,Ha Hong,Dan Yamins,Nicolas Pinto,Najib J. Majaj,James J. DiCarlo

DOI: https://doi.org/10.48550/arXiv.1301.3530

2013-01-15

Neural and Evolutionary Computing

Abstract:A key requirement for the development of effective learning representations is their evaluation and comparison to representations we know to be effective. In natural sensory domains, the community has viewed the brain as a source of inspiration and as an implicit benchmark for success. However, it has not been possible to directly test representational learning algorithms against the representations contained in neural systems. Here, we propose a new benchmark for visual representations on which we have directly tested the neural representation in multiple visual cortical areas in macaque (utilizing data from [Majaj et al., 2012]), and on which any computer vision algorithm that produces a feature space can be tested. The benchmark measures the effectiveness of the neural or machine representation by computing the classification loss on the ordered eigendecomposition of a kernel matrix [Montavon et al., 2011]. In our analysis we find that the neural representation in visual area IT is superior to visual area V4. In our analysis of representational learning algorithms, we find that three-layer models approach the representational performance of V4 and the algorithm in [Le et al., 2012] surpasses the performance of V4. Impressively, we find that a recent supervised algorithm [Krizhevsky et al., 2012] achieves performance comparable to that of IT for an intermediate level of image variation difficulty, and surpasses IT at a higher difficulty level. We believe this result represents a major milestone: it is the first learning algorithm we have found that exceeds our current estimate of IT representation performance. We hope that this benchmark will assist the community in matching the representational performance of visual cortex and will serve as an initial rallying point for further correspondence between representations derived in brains and machines.
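
The idea of scoring a representation by the classification loss on the ordered eigendecomposition of a kernel matrix can be sketched roughly as follows. This is only an illustration in the spirit of that description, not the benchmark's actual procedure. The Gaussian kernel, the bandwidth `sigma`, the least-squares projection of one-hot labels onto the leading eigenvectors, and the error measure are all assumptions, and `kernel_analysis_curve` is a hypothetical name.

```python
import numpy as np


def kernel_analysis_curve(features, labels, sigma=1.0):
    """Classification error as a function of the number of leading
    kernel eigenvectors used to fit the labels.

    features: (n_samples, n_features) representation (neural or model).
    labels:   (n_samples,) integer class labels.
    sigma:    Gaussian kernel bandwidth (assumed; normally tuned).
    """
    n = len(features)
    # Pairwise squared Euclidean distances, then a Gaussian (RBF) kernel
    sq_dists = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * sigma ** 2))

    # Eigendecomposition, with eigenvectors ordered by decreasing eigenvalue
    eigvals, eigvecs = np.linalg.eigh(kernel)
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]

    # One-hot targets so the fit works for any number of classes
    classes = np.unique(labels)
    targets = (labels[:, None] == classes[None, :]).astype(float)

    errors = []
    for d in range(1, n + 1):
        basis = eigvecs[:, :d]
        fitted = basis @ (basis.T @ targets)  # least-squares fit in d dims
        predicted = classes[fitted.argmax(axis=1)]
        errors.append(np.mean(predicted != labels))
    return np.array(errors)
```

A representation whose error curve drops quickly with few eigenvectors makes the task solvable with a simple decision function, which is the sense in which one representation can be called more effective than another here.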
Probing neural representations of scene perception in a hippocampally dependent task using artificial neural networks

Markus Frey,Christian F. Doeller,Caswell Barry

DOI: https://doi.org/10.48550/arXiv.2303.06367

2023-03-11

Abstract:Deep artificial neural networks (DNNs) trained through backpropagation provide effective models of the mammalian visual system, accurately capturing the hierarchy of neural responses through primary visual cortex to inferior temporal cortex (IT). However, the ability of these networks to explain representations in higher cortical areas is relatively lacking and considerably less well researched. For example, DNNs have been less successful as a model of the egocentric to allocentric transformation embodied by circuits in retrosplenial and posterior parietal cortex. We describe a novel scene perception benchmark inspired by a hippocampal dependent task, designed to probe the ability of DNNs to transform scenes viewed from different egocentric perspectives. Using a network architecture inspired by the connectivity between temporal lobe structures and the hippocampus, we demonstrate that DNNs trained using a triplet loss can learn this task. Moreover, by enforcing a factorized latent space, we can split information propagation into "what" and "where" pathways, which we use to reconstruct the input. This allows us to beat the state-of-the-art for unsupervised object segmentation on the CATER and MOVi-A,B,C benchmarks.

Computer Vision and Pattern Recognition,Artificial Intelligence
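
As a rough illustration of the triplet loss mentioned in the abstract above, the following NumPy sketch computes the standard margin-based form. The abstract does not state which variant was used, so the Euclidean distance, the default margin, and the idea that anchor and positive are embeddings of the same scene seen from different viewpoints are assumptions made for illustration.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean margin-based triplet loss over a batch of embeddings.

    anchor, positive, negative: arrays of shape (batch, embedding_dim).
    Assumed setup: anchor and positive embed the same scene from different
    viewpoints, and negative embeds a different scene.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=-1)
    d_neg = np.linalg.norm(anchor - negative, axis=-1)
    # Zero loss once the negative is farther than the positive by the margin
    return np.maximum(0.0, d_pos - d_neg + margin).mean()
```

The loss is zero whenever the negative lies at least `margin` farther from the anchor than the positive does, so training pushes embeddings of the same scene together and embeddings of different scenes apart.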
