Reusing Convolutional Neural Network Models through Modularization and Composition

Binhang Qi, Hailong Sun, Hongyu Zhang, Xiang Gao
2023-11-08
Abstract: With the widespread success of deep learning technologies, many trained deep neural network (DNN) models are now publicly available. However, directly reusing the public DNN models for new tasks often fails due to mismatching functionality or performance. Inspired by the notion of modularization and composition in software reuse, we investigate the possibility of improving the reusability of DNN models in a more fine-grained manner. Specifically, we propose two modularization approaches named CNNSplitter and GradSplitter, which can decompose a trained convolutional neural network (CNN) model for $N$-class classification into $N$ small reusable modules. Each module recognizes one of the $N$ classes and contains a part of the convolution kernels of the trained CNN model. Then, the resulting modules can be reused to patch existing CNN models or build new CNN models through composition. The main difference between CNNSplitter and GradSplitter lies in their search methods: the former relies on a genetic algorithm to explore the search space, while the latter utilizes a gradient-based search method. Our experiments with three representative CNNs on three widely-used public datasets demonstrate the effectiveness of the proposed approaches. Compared with CNNSplitter, GradSplitter incurs less accuracy loss, produces much smaller modules (19.88% fewer kernels), and achieves better results on patching weak models. In particular, experiments on GradSplitter show that (1) by patching weak models, the average improvement in terms of precision, recall, and F1-score is 17.13%, 4.95%, and 11.47%, respectively, and (2) for a new task, compared with the models trained from scratch, reusing modules achieves similar accuracy (the average loss of accuracy is only 2.46%) without a costly training process. Our approaches provide a viable solution to the rapid development and improvement of CNN models.
Software Engineering
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper aims to address the issues of functional mismatch or poor performance encountered when directly reusing existing Convolutional Neural Network (CNN) models for new tasks. Specifically, the authors propose two modular methods (CNNSplitter and GradSplitter) to decompose a trained multi-class CNN model into multiple small reusable modules. Each module is responsible for recognizing a specific category and can be combined to construct new CNN models or to enhance existing weak models.

### Main Challenges

1. **Functional Mismatch**: Existing CNN models may perform poorly on the target task.
2. **Performance Issues**: Even if there is a model that can solve the target task, directly reusing the model with the highest overall accuracy may not be the best choice, as the recognition ability for certain categories may be inferior to other models.

### Solutions

1. **Modularization**: Decompose the trained CNN model into multiple small modules, each responsible for recognizing a specific category.
2. **Combination**: Construct new CNN models or enhance existing weak models by combining these modules to improve model performance.

### Methods

- **CNNSplitter**: Uses a genetic algorithm to search for optimal modules by removing unnecessary convolutional kernels to reduce the size of the modules.
- **GradSplitter**: Uses a gradient-based search method to generate optimal modules by optimizing masks and heads, further improving efficiency and effectiveness.

### Experimental Results

- **Accuracy**: The new models constructed by decomposing and combining modules using GradSplitter had an average accuracy loss of only 0.58% compared to the original models.
- **Module Size**: Each module retained an average of 36.88% of the convolutional kernels from the original model.
- **Performance Improvement**: By reusing modules to enhance weak models, the average precision, recall, and F1 score increased by 17.13%, 4.95%, and 11.47%, respectively.
- **New Tasks**: For new tasks, the new models constructed by reusing modules had similar accuracy to models trained from scratch, with an average accuracy loss of only 2.46%.

### Main Contributions

1. **Proposed Compressed Modular Methods**: Including CNNSplitter and GradSplitter, which can decompose CNN models into reusable modules.
2. **Designed Search Algorithms**: Transformed the CNN modularization problem into a search problem and designed genetic algorithms and gradient-based search methods.
3. **Experimental Validation**: Extensively validated the effectiveness of modularization and combination through experiments, demonstrating its potential in improving model performance and rapidly developing new models.

### Summary

The paper effectively addresses the issue of reusing existing CNN models for new tasks through modularization and combination methods, providing a feasible solution for improving model performance and rapidly developing new models.
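
### Illustrative Sketch

The modularization and composition idea described under Methods can be made concrete with a toy example. The sketch below is not the authors' implementation: it assumes each module is a binary mask over kernels plus a small linear head, and it replaces a real CNN with a flat list of per-kernel activations. All names (`apply_mask`, `module_score`, `compose_predict`, `retained_fraction`) and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def apply_mask(kernel_outputs: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Zero out the outputs of kernels that a module does not retain."""
    return [out * keep for out, keep in zip(kernel_outputs, mask)]


def module_score(kernel_outputs: Sequence[float],
                 mask: Sequence[int],
                 head: Sequence[float]) -> float:
    """Toy module: masked kernel outputs fed to a linear head (assumed form)."""
    kept = apply_mask(kernel_outputs, mask)
    return sum(w * k for w, k in zip(head, kept))


def compose_predict(kernel_outputs: Sequence[float],
                    modules: Sequence[Dict[str, Sequence[float]]]) -> int:
    """Composition: score the input with every module, return the best class index."""
    scores = [module_score(kernel_outputs, m["mask"], m["head"]) for m in modules]
    return max(range(len(scores)), key=scores.__getitem__)


def retained_fraction(mask: Sequence[int]) -> float:
    """Share of the original kernels that a module keeps."""
    return sum(mask) / len(mask)


# Hypothetical 3-class model with 4 kernels; one module per class.
modules = [
    {"mask": [1, 1, 0, 0], "head": [1.0, 0.5, 0.0, 0.0]},  # class 0
    {"mask": [0, 1, 1, 0], "head": [0.0, 0.5, 1.0, 0.0]},  # class 1
    {"mask": [0, 0, 1, 1], "head": [0.0, 0.0, 0.5, 1.0]},  # class 2
]

# Made-up kernel activations for two inputs.
pred_a = compose_predict([0.9, 0.2, 0.1, 0.0], modules)
pred_b = compose_predict([0.0, 0.1, 0.2, 0.8], modules)
kept = retained_fraction(modules[0]["mask"])
```

In this toy setting, a search method such as a genetic algorithm or gradient descent would be responsible for choosing the `mask` (and, for a gradient-based variant, the `head`) values; here they are fixed by hand. Patching a weak model or building a model for a new task would then amount to selecting which modules go into the `modules` list.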