Stable Network Morphism

Tao Wei, Changhu Wang, Chang Wen Chen
DOI: https://doi.org/10.1109/ijcnn.2019.8851955
2019-01-01
Abstract: Deep neural networks perform better when they are deeper. Network morphism is one of the paradigms for constructing deeper neural networks. It makes it possible to build deeper networks on top of existing ones by morphing a well-trained neural network into a new one while completely preserving the network function. The morphed network also has the potential to keep growing into a more powerful one, as it has more parameters. Existing network morphism schemes include Net2Net and NetMorph. However, both of them suffer from a significant initial performance drop when the morphed network is trained further. Such instability is highly undesirable for a continual learning system. In this research, we first identify the cause of the instability: the large number of zeros padded into the parameters. Based on this observation, we propose an algorithm based on modified gradient descent to decompose the network morphism equation. As a result, the morphed parameters are all nonzero and the continual training process becomes stable. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed stable network morphism scheme.
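To make the zero-padding issue concrete, the following is a minimal sketch rather than the method of the paper. It assumes a single linear layer with weight matrix G that is morphed into two stacked linear layers F2 and F1 satisfying F2 F1 = G, with no nonlinearity and no convolution. The layer sizes and all variable names are hypothetical. The dense factorization uses a closed-form pseudoinverse as a stand-in; it is not the modified gradient descent described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parent layer: y = G @ x. Sizes are illustrative assumptions.
n_in, n_out, n_hidden = 4, 3, 6
G = rng.standard_normal((n_out, n_in))

# Zero-padded morph: F1 copies the input into the first n_in hidden
# units and leaves the remaining rows zero; F2 holds G followed by
# zero columns. The function is preserved, but most entries are zero.
F1_pad = np.zeros((n_hidden, n_in))
F1_pad[:n_in, :] = np.eye(n_in)
F2_pad = np.zeros((n_out, n_hidden))
F2_pad[:, :n_in] = G

# Dense morph (stand-in technique, not the paper's algorithm):
# draw a random F1 with full column rank, then solve for F2 with the
# pseudoinverse, so that F2 @ F1 = G @ pinv(F1) @ F1 = G.
F1_dense = rng.standard_normal((n_hidden, n_in))
F2_dense = G @ np.linalg.pinv(F1_dense)

# Both morphs reproduce the parent layer on an arbitrary input.
x = rng.standard_normal(n_in)
y_parent = G @ x
y_pad = F2_pad @ (F1_pad @ x)
y_dense = F2_dense @ (F1_dense @ x)

print("max error, zero-padded:", np.abs(y_pad - y_parent).max())
print("max error, dense:      ", np.abs(y_dense - y_parent).max())
print("zero entries in F1_pad:  ", F1_pad.size - np.count_nonzero(F1_pad))
print("zero entries in F1_dense:", F1_dense.size - np.count_nonzero(F1_dense))
```

Both factorizations preserve the parent function up to floating-point error. They differ only in how many of the new parameters start at exactly zero, which is the property the abstract links to unstable continued training.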