Abstract: With the widespread success of deep learning technologies, many trained deep neural network (DNN) models are now publicly available. However, directly reusing public DNN models for new tasks often fails because their functionality or performance does not match the new requirements. Inspired by the notions of modularization and composition in software reuse, we investigate the possibility of improving the reusability of DNN models in a more fine-grained manner. Specifically, we propose two modularization approaches, CNNSplitter and GradSplitter, which decompose a trained convolutional neural network (CNN) model for $N$-class classification into $N$ small reusable modules. Each module recognizes one of the $N$ classes and contains a subset of the convolution kernels of the trained CNN model. The resulting modules can then be reused to patch existing CNN models or to build new CNN models through composition. The main difference between CNNSplitter and GradSplitter lies in their search methods: the former relies on a genetic algorithm to explore the search space, while the latter uses a gradient-based search method. Our experiments with three representative CNNs on three widely used public datasets demonstrate the effectiveness of the proposed approaches. Compared with CNNSplitter, GradSplitter incurs less accuracy loss, produces much smaller modules (19.88% fewer kernels), and achieves better results when patching weak models. In particular, experiments on GradSplitter show that (1) patching weak models improves precision, recall, and F1-score by 17.13%, 4.95%, and 11.47% on average, respectively, and (2) for a new task, reusing modules achieves accuracy similar to that of models trained from scratch (the average accuracy loss is only 2.46%) without a costly training process. Our approaches provide a viable solution for the rapid development and improvement of CNN models.
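The decomposition described in the abstract can be pictured with a toy sketch. It is not the CNNSplitter or GradSplitter implementation. It assumes that each module is defined by a binary mask over the trained model's kernels, and that composition picks the class whose module reports the highest score; the abstract does not spell out the composition rule, so that part is an assumption. All names (`extract_module`, `decompose`, `kept_fraction`, `compose_predict`) are hypothetical, and the masks are supplied by hand, whereas the paper obtains them through a genetic or gradient-based search.

```python
from typing import List, Sequence

# A toy stand-in for a convolution kernel: a small 2-D grid of weights.
Kernel = List[List[float]]


def extract_module(kernels: Sequence[Kernel], mask: Sequence[int]) -> List[Kernel]:
    """Keep only the kernels whose mask entry is non-zero.

    Assumption: a module is simply the retained subset of the trained
    model's kernels, selected by a per-class binary mask.
    """
    if len(kernels) != len(mask):
        raise ValueError("mask must have one entry per kernel")
    return [k for k, keep in zip(kernels, mask) if keep]


def decompose(kernels: Sequence[Kernel], masks: Sequence[Sequence[int]]) -> List[List[Kernel]]:
    """Build one module per class from N hand-supplied masks (N = len(masks))."""
    return [extract_module(kernels, m) for m in masks]


def kept_fraction(mask: Sequence[int]) -> float:
    """Fraction of the original kernels that a module retains."""
    return sum(1 for keep in mask if keep) / len(mask)


def compose_predict(module_scores: Sequence[float]) -> int:
    """Assumed composition rule: predict the class whose module scores highest."""
    return max(range(len(module_scores)), key=lambda c: module_scores[c])
```

For example, with four kernels and three classes, the masks `[1, 0, 1, 0]`, `[0, 1, 1, 0]`, and `[0, 0, 0, 1]` yield three modules that retain two, two, and one kernel respectively. Kernels may be shared between modules, as the third kernel is here.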

Initialization of CNN Models for Training on a Small Dataset Using Importance of Filter Parameters

How to Initialize the CNN for Small Datasets: Extracting Discriminative Filters from Pre-Trained Model

Learning Structure and Strength of CNN Filters for Small Sample Size Training

Exploring the parameter reusability of CNN

Refining Architectures of Deep Convolutional Neural Networks

Convolutional Initialization for Data-Efficient Vision Transformers

Learning Efficient Convolutional Networks Through Network Slimming

Building Efficient CNNs Using Depthwise Convolutional Eigen-Filters (DeCEF)

Learning Sparse Features with Lightweight ScatterNet for Small Sample Training

Importance-Aware Filter Selection for Convolutional Neural Network Acceleration

Full-Stack Filters to Build Minimum Viable CNNs

Learning to Prune Filters in Convolutional Neural Networks

An Entropy-based Pruning Method for CNN Compression

Reusing Convolutional Neural Network Models through Modularization and Composition

Optimizing Convolutional Neural Network Architecture

Isomorphic Model-Based Initialization for Convolutional Neural Networks

A pilot study of novel multi-filter CNN layer

Puppet-CNN: Input-Adaptive Convolutional Neural Networks with Model Compression using Ordinary Differential Equation

Structured Receptive Fields in CNNs

Learning Filter Scale and Orientation In CNNs