Network Fission Ensembles for Low-Cost Self-Ensembles

Hojung Lee,Jong-Seok Lee
2024-08-05
Abstract:Recent ensemble learning methods for image classification have been shown to improve classification accuracy with low extra cost. However, they still require multiple trained models for ensemble inference, which eventually becomes a significant burden when the model size increases. In this paper, we propose a low-cost ensemble learning and inference, called Network Fission Ensembles (NFE), by converting a conventional network itself into a multi-exit structure. Starting from a given initial network, we first prune some of the weights to reduce the training burden. We then group the remaining weights into several sets and create multiple auxiliary paths using each set to construct multi-exits. We call this process Network Fission. Through this, multiple outputs can be obtained from a single network, which enables ensemble learning. Since this process simply changes the existing network structure to multi-exits without using additional networks, there is no extra computational burden for ensemble learning and inference. Moreover, by learning from multiple losses of all exits, the multi-exits improve performance via regularization, and high performance can be achieved even with increased network sparsity. With our simple yet effective method, we achieve significant improvement compared to existing ensemble methods. The code is available at <a class="link-external link-https" href="https://github.com/hjdw2/NFE" rel="external noopener nofollow">this https URL</a>.
Computer Vision and Pattern Recognition,Machine Learning
What problem does this paper attempt to address?
### What problems does this paper attempt to solve? This paper aims to solve the problem of excessive computational cost faced by traditional ensemble learning methods in image classification tasks. Specifically, although existing ensemble learning methods can improve classification accuracy through the combination of multiple models, they usually need to train and load multiple models, which will bring significant computational burden and memory complexity problems when the model scale increases. #### Summary of main problems: 1. **High consumption of computational resources**: Traditional ensemble learning methods require multiple trained models for ensemble inference. As the model size increases, the consumption of computational resources becomes very large. 2. **High memory complexity**: Since multiple models need to be loaded, the memory requirement also increases accordingly. Especially when dealing with large - scale neural networks, this has extremely high hardware requirements. 3. **Trade - off between performance and sparsity**: Some methods reduce the computational burden through pruning. However, as the pruning rate (sparsity) increases, the performance often declines, and it is difficult to maintain high ensemble performance. To solve these problems, the author proposes a new ensemble learning method with low computational cost - **Network Fission Ensembles (NFE)**. This method converts a regular network into a multi - exit structure, enabling a single network to produce multiple outputs, thereby achieving ensemble learning. This method does not require additional networks or modules, so it does not increase the number of parameters or memory usage, achieving almost zero - cost ensemble learning and inference. #### Main innovation points of NFE: - **Multi - exit structure**: By grouping weights at different stages of the network and creating auxiliary paths (exits), a single network can produce multiple outputs. - **Pruning and weight grouping**: First, the initial network is pruned to reduce the training burden, and then the pruned weights are grouped to form multiple auxiliary paths. - **Low computational cost**: The whole process only changes the calculation flow of the existing network and does not introduce additional networks or parameters, so the computational cost is extremely low. - **Performance improvement**: By learning from multiple losses, the multi - exit structure not only improves the overall performance but also can maintain good performance in the case of a high pruning rate. In conclusion, NFE provides an efficient and low - cost ensemble learning method, which can significantly improve the performance of image classification tasks without significantly increasing computational resources.