Abstract: Modern deep convolutional neural networks (CNNs) are often designed to be scalable, which gives rise to the concept of a model family: a large (possibly infinite) collection of related neural network architectures. The isomorphism of a model family refers to the fact that its models share the same high-level structure; models within the same family are called isomorphic models of one another. Existing weight initialization methods for CNNs rely on random or data-driven initialization. Although these methods initialize networks satisfactorily, they rarely exploit the isomorphism of model families. This work proposes an isomorphic model-based initialization method (IM Init) for CNNs, which can initialize any network from another well-trained isomorphic model in the same model family. We first formulate the general network structure widely used in CNNs. We then present a structural weight transformation that transforms the weights between two isomorphic models. Finally, we apply IM Init to model down-sampling and up-sampling scenarios and confirm through experiments on various image classification datasets that it improves accuracy and convergence speed. In the model down-sampling scenario, IM Init initializes a smaller target model with a larger well-trained source model, improving the accuracy of RegNet200MF by 1.59% on the CIFAR-100 dataset and 1.9% on the CUB200 dataset. Conversely, in the model up-sampling scenario, IM Init initializes a larger target model with a smaller well-trained source model; it significantly speeds up the convergence of RegNet600MF and improves accuracy by 30.10% under short training schedules. Code will be available.
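
The abstract does not spell out the structural weight transformation, so the following is only a minimal sketch of the general idea of moving weights between two layers that share a structure but differ in width. It assumes, purely for illustration, that a convolution kernel is cropped along each axis when the target is smaller and zero-padded when the target is larger. The helper name `transform_conv_weight` and the layer shapes are hypothetical and are not taken from the paper.

```python
import numpy as np


def transform_conv_weight(src, target_shape):
    """Crop or zero-pad a conv kernel of shape (out_ch, in_ch, kH, kW).

    Hypothetical helper for illustration only; this is NOT the
    transformation defined by IM Init.
    """
    dst = np.zeros(target_shape, dtype=src.dtype)
    # Region shared by both tensors along every axis.
    overlap = tuple(slice(0, min(s, t)) for s, t in zip(src.shape, target_shape))
    dst[overlap] = src[overlap]
    return dst


rng = np.random.default_rng(0)
# Stand-in for one layer of a well-trained, wider source model.
large = rng.standard_normal((64, 32, 3, 3))

# Down-sampling: the narrower target layer takes the leading channels.
small = transform_conv_weight(large, (16, 8, 3, 3))

# Up-sampling: the wider target layer embeds the narrow kernel, rest is zero.
widened = transform_conv_weight(small, (64, 32, 3, 3))
```

In this sketch the up-sampled channels that have no counterpart in the source stay at zero; a practical scheme would more likely fill them with a standard random initializer, and the choice of which source channels to keep is likewise an assumption here.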

How To Initialize The CNN For Small Datasets: Extracting Discriminative Filters From Pre-Trained Model

Convolutional Initialization for Data-Efficient Vision Transformers

Isomorphic Model-Based Initialization for Convolutional Neural Networks

Learning Structure and Strength of CNN Filters for Small Sample Size Training

How to Initialize your Network? Robust Initialization for WeightNorm & ResNets

Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters

Initialization Matters: On the Benign Overfitting of Two-Layer ReLU CNN with Fully Trainable Layers

Overfitting Remedy by Sparsifying Regularization on Fully-Connected Layers of CNNs

Image Denoising Based On A CNN Model

A Study on Training Fine-Tuning of Convolutional Neural Networks

Neural Network Pruning with Residual-Connections and Limited-Data

Compression of Convolutional Neural Networks With Divergent Representation of Filters

Structured Receptive Fields in CNNs

Adaptive Signal Variances: CNN Initialization Through Modern Architectures

Can we learn better with hard samples?

Understanding the Initial Condensation of Convolutional Neural Networks

The Impact of Reinitialization on Generalization in Convolutional Neural Networks

Unsupervised Pre-Trained Filter Learning Approach for Efficient Convolution Neural Network

Catch-Up Mix: Catch-Up Class for Struggling Filters in CNN

On filter design in deep convolutional neural network