Convolutional layers are equivariant to discrete shifts but not continuous translations

Nick McGreivy,Ammar Hakim
2023-12-07
Abstract:The purpose of this short and simple note is to clarify a common misconception about convolutional neural networks (CNNs). CNNs are made up of convolutional layers which are shift equivariant due to weight sharing. However, convolutional layers are not translation equivariant, even when boundary effects are ignored and when pooling and subsampling are absent. This is because shift equivariance is a discrete symmetry while translation equivariance is a continuous symmetry. This fact is well known among researchers in equivariant machine learning, but is usually overlooked among non-experts. To minimize confusion, we suggest using the term `shift equivariance' to refer to discrete shifts in pixels and `translation equivariance' to refer to continuous translations.
Computer Vision and Pattern Recognition,Machine Learning
What problem does this paper attempt to address?
The paper aims to clarify a common misconception in Convolutional Neural Networks (CNNs). Specifically, it points out that convolutional layers possess shift equivariance, meaning they maintain equivariance under discrete pixel shifts, but do not possess translation equivariance under continuous translations. This distinction lies in the difference between discrete and continuous symmetries. While this is a known fact among researchers in equivariant machine learning, it is often overlooked by non-experts. To reduce confusion, the authors suggest clearly distinguishing the terms "shift equivariance" and "continuous translation equivariance," with the former referring to discrete pixel shifts and the latter to continuous image translations. The paper demonstrates through theoretical analysis and examples that convolutional layers cannot achieve true continuous translation equivariance because continuous translation involves transformations in the real number domain, whereas convolutional layers deal with discrete data. Even when ignoring boundary effects and not performing pooling or downsampling operations, convolutional layers still lack continuous translation equivariance. This is because the operations of convolutional layers are based on discrete models, which lack continuous symmetry. Through specific image examples, the authors further intuitively illustrate this point.