Abstract: Deep neural network (DNN) models have become increasingly crucial components of intelligent software systems. However, training a DNN model is typically expensive in terms of both time and money. To address this issue, researchers have recently focused on reusing existing DNN models, borrowing the idea of code reuse from software engineering. However, reusing an entire model can incur extra overhead or inherit weaknesses from undesired functionalities. Hence, existing work proposes decomposing an already trained model into modules, i.e., modularizing-after-training, to enable module reuse. Because trained models are not built for modularization, modularizing-after-training incurs substantial overhead and a loss of model accuracy. In this paper, we propose a novel approach that incorporates modularization into the model training process, i.e., modularizing-while-training (MwT). We train a model to be structurally modular through two loss functions that optimize intra-module cohesion and inter-module coupling. In this work, we have implemented the proposed approach for modularizing Convolutional Neural Network (CNN) models. The evaluation results on representative models demonstrate that MwT outperforms the state-of-the-art approach. Specifically, the accuracy loss caused by MwT is only 1.13 percentage points, which is 1.76 percentage points less than that of the baseline. The kernel retention rate of the modules generated by MwT is only 14.58%, a reduction of 74.31% compared with the state-of-the-art approach. Furthermore, the total time cost of training and modularizing is only 108 minutes, half that of the baseline.
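
The abstract names two loss terms, one for intra-module cohesion and one for inter-module coupling, but does not define them. The following is a minimal sketch of how such terms could be computed, not the paper's formulation. It assumes that each sample activates a subset of convolution kernels described by a mask vector, and that similarity between masks is measured with a soft Jaccard index. The function names, the mask representation, the Jaccard choice, and the weights `alpha` and `beta` are all hypothetical. It uses NumPy only to show the quantities; actual training would need a differentiable framework.

```python
import numpy as np

def jaccard(a, b):
    """Soft Jaccard similarity between two non-negative mask vectors (assumed metric)."""
    inter = np.minimum(a, b).sum()
    union = np.maximum(a, b).sum()
    return inter / union if union > 0 else 0.0

def cohesion_and_coupling(masks, labels):
    """Return the mean mask similarity over same-class pairs (cohesion)
    and over different-class pairs (coupling)."""
    same, diff = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            s = jaccard(masks[i], masks[j])
            (same if labels[i] == labels[j] else diff).append(s)
    cohesion = float(np.mean(same)) if same else 0.0
    coupling = float(np.mean(diff)) if diff else 0.0
    return cohesion, coupling

def modular_loss(task_loss, masks, labels, alpha=1.0, beta=1.0):
    """Hypothetical combined objective: reward high cohesion, penalize coupling."""
    cohesion, coupling = cohesion_and_coupling(masks, labels)
    return task_loss + alpha * (1.0 - cohesion) + beta * coupling

def kernel_retention_rate(module_mask):
    """Fraction of kernels a module keeps (non-zero mask entries)."""
    return float(np.count_nonzero(module_mask)) / module_mask.size
```

Under these assumptions, samples of the same class are pushed to share kernels (cohesion approaches 1), while samples of different classes are pushed onto disjoint kernels (coupling approaches 0). A module for one class would then retain only the kernels its class uses, which is what a low kernel retention rate reflects.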

Stable Network Morphism

Network Morphism

Modularized Morphing of Neural Networks

Memorizing morph patterns in small-world neuronal network

Neural Metamorphosis

Stable Architectures for Deep Neural Networks

Shallow-Deep Networks: Understanding and Mitigating Network Overthinking

Learning Morphisms with Gauss-Newton Approximation for Growing Networks

Optimization Algorithm Inspired Deep Neural Network Structure Design

An Adaptive and Stability-Promoting Layerwise Training Approach for Sparse Deep Neural Network Architecture

Metabolize Neural Network

Stabilize deep ResNet with a sharp scaling factor τ

On Multi-Stage Loss Dynamics in Neural Networks: Mechanisms of Plateau and Descent Stages

Modularizing while Training: A New Paradigm for Modularizing DNN Models

Compressing deep neural networks by matrix product operators

Neural Network Module Decomposition and Recomposition

Stable ResNet

Nonlinear Collaborative Scheme for Deep Neural Networks

Stability for the training of deep neural networks and other classifiers

Morphological Network: How Far Can We Go with Morphological Neurons?

Learning Stages: Phenomenon, Root Cause, Mechanism Hypothesis, and Implications