Abstract: We approach the issue of robust machine vision by presenting a novel deep-learning architecture, inspired by work in theoretical neuroscience on how the primate brain performs visual feature binding. Feature binding describes how separately represented features are encoded in a relationally meaningful way, such as an edge composing part of the larger contour of an object. We propose that the absence of such representations from current models might partly explain their vulnerability to small, often humanly imperceptible distortions known as adversarial examples. It has been proposed that adversarial examples are a result of 'off-manifold' perturbations of images. Our novel architecture is designed to approximate hierarchical feature binding, providing explicit representations in these otherwise vulnerable directions. Having introduced these representations into convolutional neural networks, we provide empirical evidence of enhanced robustness against a broad range of L0, L2 and L∞ attacks, particularly in the black-box setting. While we ultimately report that the model remains vulnerable to a sufficiently powerful attacker (i.e., the defense can be broken), we demonstrate that our main results cannot be accounted for by trivial, false robustness (gradient masking). Analysis of the representational geometry of our architectures shows a positive relationship between hierarchical binding, expanded manifolds, and robustness. Through hyperparameter manipulation, we find evidence that robustness emerges through the preservation of general low-level information alongside more abstract features, rather than by capturing which specific low-level features drove the abstract representation. Finally, we propose how hierarchical binding relates to the observation that, under appropriate viewing conditions, humans show sensitivity to adversarial examples.

Exploring Geometry of Blind Spots in Vision Models

Towards a More Rigorous Science of Blindspot Discovery in Image Classification Models

On Inherent Adversarial Robustness of Active Vision Systems

BlindSpotNet: Seeing Where We Cannot See

Automatic Discovery of Visual Circuits

DISCOVER: Making Vision Networks Interpretable via Competition and Dissection

Towards Evaluating the Robustness of Visual State Space Models

Occlusion Sensitivity Analysis with Augmentation Subspace Perturbation in Deep Feature Space

Robustness of 3D Deep Learning in an Adversarial Setting

Reveal of Vision Transformers Robustness against Adversarial Attacks

Hierarchical binding in convolutional neural networks: Making adversarial attacks geometrically challenging

Beyond Sight: Probing Alignment Between Image Models and Blind V1

Now You See Me: Robust approach to Partial Occlusions

On Network Design Spaces for Visual Recognition

Saliency Suppressed, Semantics Surfaced: Visual Transformations in Neural Networks and the Brain

Understanding Neural Networks Through Deep Visualization

Sparse Double Descent in Vision Transformers: real or phantom threat?

Learning Local Distortion Visibility From Image Quality Data-sets

Intriguing Equivalence Structures of the Embedding Space of Vision Transformers

Central and peripheral vision for scene recognition: A neurocomputational modeling exploration

Exploring Robustness of Visual State Space model against Backdoor Attacks