Abstract: The purpose of this short and simple note is to clarify a common misconception about convolutional neural networks (CNNs). CNNs are made up of convolutional layers which are shift equivariant due to weight sharing. However, convolutional layers are not translation equivariant, even when boundary effects are ignored and when pooling and subsampling are absent. This is because shift equivariance is a discrete symmetry while translation equivariance is a continuous symmetry. This fact is well known among researchers in equivariant machine learning, but is usually overlooked among non-experts. To minimize confusion, we suggest using the term 'shift equivariance' to refer to discrete shifts in pixels and 'translation equivariance' to refer to continuous translations.

What problem does this paper attempt to address?

The paper aims to clarify a common misconception about convolutional neural networks (CNNs). Convolutional layers are shift equivariant: shifting the input by a whole number of pixels shifts the output by the same number of pixels. They are not, however, equivariant under continuous translations, which can move an image by any real-valued distance, including fractions of a pixel. The distinction is one between a discrete symmetry and a continuous symmetry. This is well known among researchers in equivariant machine learning but is often overlooked by non-experts. To reduce confusion, the authors suggest reserving the term "shift equivariance" for discrete pixel shifts and "translation equivariance" for continuous translations.

The paper argues, through theoretical analysis and image examples, that convolutional layers cannot be equivariant to continuous translations: a continuous translation acts over the real numbers, whereas a convolutional layer operates on data sampled on a discrete pixel grid. This holds even when boundary effects are ignored and no pooling or downsampling is performed.
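The distinction can be illustrated numerically. The following is a minimal sketch, not an example taken from the paper: the 1-D periodic grid, the Gaussian bump, the kernel weights, and the helper names `circular_conv` and `sample_bump` are all assumptions chosen for illustration. It samples a continuous bump on a pixel grid, applies a weight-shared circular convolution (so there are no boundary effects, pooling, or subsampling), and compares the output for an integer shift and for a half-pixel translation of the underlying bump.

```python
import numpy as np

def circular_conv(signal, kernel):
    """Circular 1-D cross-correlation with shared weights (no boundary effects)."""
    out = np.zeros_like(signal, dtype=float)
    for offset, weight in enumerate(kernel):
        # np.roll(signal, -offset)[i] == signal[(i + offset) % n]
        out += weight * np.roll(signal, -offset)
    return out

def sample_bump(n_pixels, center):
    """Sample a Gaussian bump centred at `center` (pixel units) on a periodic grid."""
    x = np.arange(n_pixels)
    # Signed distance to the centre, wrapped onto the periodic grid.
    d = (x - center + n_pixels / 2) % n_pixels - n_pixels / 2
    return np.exp(-0.5 * (d / 1.5) ** 2)

n = 32
kernel = np.array([0.25, 0.5, -1.0])  # arbitrary illustrative weights

out_base = circular_conv(sample_bump(n, 10.0), kernel)

# Discrete shift: move the bump by exactly 3 pixels.
out_shifted = circular_conv(sample_bump(n, 13.0), kernel)
shift_equivariant = np.allclose(out_shifted, np.roll(out_base, 3))

# Continuous translation: move the bump by half a pixel before sampling.
out_half = circular_conv(sample_bump(n, 10.5), kernel)
matches_any_roll = any(
    np.allclose(out_half, np.roll(out_base, s)) for s in range(n)
)

print("integer shift reproduced by rolling the output:", shift_equivariant)
print("half-pixel translation reproduced by some roll:", matches_any_roll)
```

Under these assumptions, the integer shift is reproduced exactly by rolling the output, whereas no pixel shift of the original output reproduces the output for the half-pixel translation. Relating the two outputs would require some interpolation scheme on the grid, which is an additional modelling choice rather than a property of the convolutional layer itself.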

Convolutional layers are equivariant to discrete shifts but not continuous translations
