ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training

Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Herve Jegou
DOI: https://doi.org/10.1109/tpami.2022.3206148
IF: 23.6
2023-03-11
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We also train ResMLP models in a self-supervised setup, to further remove priors from employing a labelled dataset. Finally, by adapting our model to machine translation we achieve surprisingly good results. We share pre-trained models and our code based on the Timm library.
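
To make the two alternating sublayers in the abstract concrete, here is a minimal dependency-free sketch of one residual block. It is an illustration under stated assumptions, not the authors' released code: the function names, the plain list-of-rows matrix layout, the omission of biases and of any normalisation, and the choice of GELU as the nonlinearity are all assumptions made here, since the abstract does not specify them.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices (the residual connection)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, rb_rows(b))]


def rb_rows(m):
    """Return the rows of m unchanged; kept separate only to make add() readable."""
    return m


def gelu(v):
    """Exact GELU; the activation choice is an assumption of this sketch."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def resmlp_block(x, w_patch, w1, w2):
    """One residual block on x with shape (num_patches N, channels D).

    w_patch: N x N, mixes patches; the same matrix is applied to every channel.
    w1:      D x H, first layer of the per-patch feed-forward network.
    w2:      H x D, second layer of the per-patch feed-forward network.
    """
    # (i) Cross-patch linear layer: left-multiplying by w_patch mixes rows
    # (patches) and treats every column (channel) identically and independently.
    x = add(x, matmul(w_patch, x))
    # (ii) Two-layer feed-forward network: right-multiplication acts on each
    # row (patch) separately, so only channels interact here.
    hidden = [[gelu(v) for v in row] for row in matmul(x, w1)]
    return add(x, matmul(hidden, w2))
```

The output has the same shape as the input, so blocks of this form can be stacked. With all weights set to zero the block reduces to the identity, which is the usual benefit of the residual formulation.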
computer science, artificial intelligence, engineering, electrical & electronic