Information Geometry on Pruning of Neural Network

YH Liu, SW Luo, AJ Li, HB Yu
DOI: https://doi.org/10.1109/icmlc.2004.1380390
2004-01-01
Abstract: Determining the proper size of an artificial neural network is widely recognized as a crucial problem. One popular approach is pruning, which means training a larger-than-necessary network and then removing unnecessary weights or nodes. Although pruning is commonly used in architecture learning for neural networks, it still lacks a theoretical framework. We give an information-geometric explanation of pruning. In the information-geometric framework, most kinds of neural networks form an exponential or mixture manifold, which has a natural hierarchical structure. In a hierarchical set of systems, a lower-order system is included in the parameter space of a larger one as a submanifold. Such a parameter space has rich geometrical structures that are responsible for the dynamic behavior of learning. The pruning problem is formulated as a sequence of iterative m-projections from the current manifold to its submanifold, in which the divergence between the two manifolds is minimized, meaning that the network performance does not worsen over the entire pruning process. The result gives a geometric understanding of pruning and an information-geometric guideline for it, placing pruning on a firmer theoretical foundation.
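The m-projection mentioned in the abstract can be illustrated with a toy case. This is a minimal sketch and not the authors' algorithm: it assumes a joint distribution over two binary variables as the "larger" model and the set of product (independent) distributions as the submanifold, which corresponds to dropping the interaction term of a log-linear model. The function names `kl_divergence` and `m_project_to_independence` are hypothetical. In this setting the m-projection, defined as the point q of the submanifold minimizing KL(p || q), is known to be the product of the marginals of p.

```python
import numpy as np


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Terms where p is zero contribute nothing; q is assumed positive
    wherever p is positive.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def m_project_to_independence(p_joint):
    """m-projection of a 2-D joint table onto product distributions.

    Assumption of this sketch: the submanifold is the set of
    distributions q(x, y) = q1(x) * q2(y). The minimizer of
    KL(p || q) over that set is the outer product of p's marginals.
    """
    p_joint = np.asarray(p_joint, dtype=float)
    px = p_joint.sum(axis=1)  # marginal over the first variable
    py = p_joint.sum(axis=0)  # marginal over the second variable
    return np.outer(px, py)


# Toy "larger" model: two correlated binary variables.
p = np.array([[0.4, 0.1],
              [0.1, 0.4]])

q_star = m_project_to_independence(p)   # projected ("pruned") model
d_star = kl_divergence(p, q_star)       # divergence lost by pruning

# Any other product distribution should be at least as far from p.
q_other = np.outer([0.6, 0.4], [0.5, 0.5])
d_other = kl_divergence(p, q_other)

print("projection:\n", q_star)
print("KL(p || q_star) =", d_star)
print("KL(p || q_other) =", d_other)
```

The divergence `d_star` measures how much is lost by removing the interaction between the two variables; in an iterative scheme of the kind the abstract describes, each pruning step would choose the submanifold point with the smallest such divergence.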