Exposition and Interpretation of the Topology of Neural Networks

Rickard Brüel Gabrielsson, Gunnar Carlsson
DOI: https://doi.org/10.48550/arXiv.1810.03234
2019-10-18
Abstract: Convolutional neural networks (CNNs) are powerful and widely used tools. However, their interpretability is far from ideal. One such shortcoming is the difficulty of deducing a network's ability to generalize to unseen data. We use topological data analysis to show that the information encoded in the weights of a CNN can be organized in terms of a topological data model and demonstrate how such information can be interpreted and utilized. We show that the weights of convolutional layers at depths from 1 through 13 learn simple global structures. We also demonstrate the change of the simple structures over the course of training. In particular, we define and analyze the spaces of spatial filters of convolutional layers and show the recurrence, among all networks, depths, and during training, of a simple circle consisting of rotating edges, as well as a less recurring unanticipated complex circle that combines lines, edges, and non-linear patterns. We also demonstrate that topological structure correlates with a network's ability to generalize to unseen data and that topological information can be used to improve a network's performance. We train over a thousand CNNs on MNIST, CIFAR-10, SVHN, and ImageNet.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses two related problems: understanding how Convolutional Neural Networks (CNNs) work and learn, and improving their ability to generalize to unseen data. Specifically:

1. **Improving the interpretability of CNNs**: Using Topological Data Analysis (TDA), the paper shows that the information in CNN weights can be organized into simple topological data models. These models summarize the global structure of the weight configuration space and so give insight into what a CNN has learned.
2. **Exploring the spatial structure of CNN weights**: The paper analyzes the spatial filters of convolutional layers at different depths and identifies simple topological structures that these filters form during training, such as the Primary Circle and the Secondary Circle. These structures recur across networks, depths, and stages of training.
3. **Evaluating the generalization ability of CNNs**: The paper studies the relationship between topological structure and generalization and finds that a more pronounced topological structure usually corresponds to better generalization to unseen data.
4. **Improving network performance**: The paper proposes attaching idealized weight features to the input image in order to speed up training and raise test accuracy. Experiments show that this is effective on both MNIST and SVHN, and especially on the more complex SVHN dataset.

In summary, the paper uses topological data analysis to make CNNs more interpretable, to reveal their internal structure, and to exploit that structure to improve generalization and training efficiency.
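The spatial-filter analysis described above can be illustrated with a minimal NumPy sketch. It is not the authors' code, and it rests on several assumptions: that each 2-D kernel slice of a convolutional weight tensor is treated as one point, that points are mean-centered and scaled to unit norm before analysis, and that the "circle of rotating edges" can be approximated by linear ramps `cos(t)*x + sin(t)*y`. The function names `spatial_filters`, `rotating_edge_circle`, and `distance_to_circle` are hypothetical.

```python
import numpy as np


def spatial_filters(weights):
    """Turn conv weights of shape (out, in, kh, kw) into a point cloud.

    Each kh x kw slice becomes one row of length kh*kw. Rows are
    mean-centered and scaled to unit norm (an assumed preprocessing step);
    rows that are numerically constant are dropped.
    """
    w = np.asarray(weights, dtype=float)
    pts = w.reshape(-1, w.shape[-2] * w.shape[-1])
    pts = pts - pts.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(pts, axis=1, keepdims=True)
    keep = norms[:, 0] > 1e-12
    return pts[keep] / norms[keep]


def rotating_edge_circle(n_angles=16, size=3):
    """Sample an idealized circle of rotating edge filters (assumed model).

    Each sample is the linear ramp cos(t)*x + sin(t)*y on a size x size
    grid centered at the origin, preprocessed like the learned filters.
    """
    coords = np.arange(size) - (size - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    ramps = np.stack([np.cos(t) * x + np.sin(t) * y for t in angles])
    return spatial_filters(ramps[:, None, :, :])


def distance_to_circle(points, circle):
    """Euclidean distance from each point to its nearest circle sample."""
    diffs = points[:, None, :] - circle[None, :, :]
    return np.linalg.norm(diffs, axis=2).min(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_weights = rng.normal(size=(4, 3, 3, 3))  # stand-in for a trained layer
    cloud = spatial_filters(fake_weights)
    circle = rotating_edge_circle()
    print(cloud.shape, distance_to_circle(cloud, circle).mean())
```

With weights from a trained network in place of `fake_weights`, a small mean distance would suggest that the learned filters concentrate near the idealized circle. A proper topological analysis would apply density filtering and a tool such as Mapper or persistent homology to the point cloud instead of this crude distance.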