Transfer entropy-based feedback improves performance in artificial neural networks

Sebastian Herzog, Christian Tetzlaff, Florentin Wörgötter
DOI: https://doi.org/10.48550/arXiv.1706.04265
2017-06-22
Abstract: The structure of the majority of modern deep neural networks is characterized by unidirectional feed-forward connectivity across a very large number of layers. By contrast, the architecture of the cortex of vertebrates contains fewer hierarchical levels but many recurrent and feedback connections. Here we show that a small, few-layer artificial neural network that employs feedback will reach top-level performance on a standard benchmark task, otherwise only obtained by large feed-forward structures. To achieve this we use feed-forward transfer entropy between neurons to structure feedback connectivity. Transfer entropy can here intuitively be understood as a measure for the relevance of certain pathways in the network, which are then amplified by feedback. Feedback may therefore be key for high network performance in small brain-like architectures.
Machine Learning, Information Theory, Neural and Evolutionary Computing
What problem does this paper attempt to address?
The core problem this paper addresses is: **how to improve the performance of small artificial neural networks by introducing feedback connections, so that they match large feed-forward networks on image classification tasks**.

### Problem Background

Modern deep neural networks usually have many hierarchical levels and unidirectional feed-forward connections, while the cerebral cortex of vertebrates contains fewer levels but many recurrent and feedback connections. This architectural difference lets the brain achieve high performance with a shallower hierarchy, whereas existing deep-learning models rely on very deep hierarchies (such as 152 layers) to achieve similar results.

### Main Contributions of the Paper

The authors propose a method based on **Transfer Entropy (TE)** to define feedback connections and apply it to a small convolutional neural network (AlexNet). With this method, they improve the network's performance on a standard image classification benchmark (the CIFAR-10 dataset), bringing it close to, or even beyond, larger and more complex feed-forward networks.

### Specific Methods

1. **Training the Feed-forward Network**: First, train a feed-forward version of AlexNet (FF-AlexNet) using the standard back-propagation algorithm and record the activations of all nodes.
2. **Calculating Transfer Entropy**: Based on the trained network, calculate the transfer entropy between each pair of nodes to evaluate the importance of different paths.
3. **Defining Feedback Connections**: Define the weights of feedback connections according to the magnitude of transfer entropy. Specifically, if the average transfer entropy between two nodes is lower than a certain threshold \(\Phi\), a feedback connection is established between these two nodes, and the connection weight scales with the inter-layer distance \(|\beta - \alpha|\), as in the weight formula under Key Formulas.
4. **Running the Network with Feedback**: Add the feedback connections to the network, re-run the entire dataset, and evaluate its performance.

### Experimental Results

After feedback connections are introduced, the classification performance of the network improves from 85% to 95%. The improvement is robust: performance remains good under different parameter settings. In addition, the feedback connections increase local information storage in the network, which further supports their effectiveness.

### Conclusion

This research shows that well-designed feedback connections can significantly improve the performance of small neural networks without increasing network complexity. This suggests a direction for future research on more efficient deep-learning models that are closer to biological neural systems.

### Key Formulas

The transfer entropy \(T_{i \to j}\) is defined as follows:

\[ T_{i \to j}(d) = \int_{X_q} \int_{X_q} \int_{X_q} dy_j(t)\, dy_j(t - d)\, dy_i(t - d)\; p[y_j(t), y_j(t - d), y_i(t - d)] \log\left(\frac{p[y_j(t) \mid y_j(t - d), y_i(t - d)]}{p[y_j(t) \mid y_j(t - d)]}\right) \]

For discrete-time systems, the simplified transfer entropy formula is:

\[ T_{i \to j}(d) = \sum_{X_q} p[y_j(t_n), y_j(t_n - d), y_i(t_n - d)] \log\left(\frac{p[y_j(t_n) \mid y_j(t_n - d), y_i(t_n - d)]}{p[y_j(t_n) \mid y_j(t_n - d)]}\right) \]

The final formula for calculating the feedback connection weights is:

\[ f_{j\beta \to i\alpha} = \begin{cases} \frac{w_{\text{min}} |\beta - \alpha|}{L} & \text{if } \tilde{T}_{i\alpha \to j\beta} < \Phi \\ 0 & \text{else} \end{cases} \]

where \(w_{\text{min}}\) is the minimum weight in the feed-forward network, and \(L\)
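
As a rough illustration of the discrete-time transfer entropy and the feedback-weight rule quoted above, the following Python sketch may help. It is not the authors' implementation. It assumes the node activations have already been discretized into a small set of symbols, estimates probabilities by simple counting (a plug-in estimate, in bits), and treats \(L\) as a positive normalizing constant supplied by the caller, since the text above does not finish defining it. The function names `transfer_entropy` and `feedback_weight`, and the toy sequences, are hypothetical.

```python
import random
from collections import Counter
from math import log2


def transfer_entropy(source, target, d=1):
    """Plug-in estimate (in bits) of T_{source -> target}(d) for two
    equally long sequences of discrete symbols."""
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    if not 1 <= d < len(target):
        raise ValueError("d must satisfy 1 <= d < len(target)")

    # Each triple is (y_j(t), y_j(t - d), y_i(t - d)).
    triples = [(target[t], target[t - d], source[t - d])
               for t in range(d, len(target))]
    n = len(triples)

    n_abc = Counter(triples)                        # joint counts
    n_ab = Counter((a, b) for a, b, _ in triples)   # (y_j(t), y_j(t-d))
    n_bc = Counter((b, c) for _, b, c in triples)   # (y_j(t-d), y_i(t-d))
    n_b = Counter(b for _, b, _ in triples)         # y_j(t-d)

    te = 0.0
    for (a, b, c), k in n_abc.items():
        p_joint = k / n
        p_given_both = k / n_bc[(b, c)]        # p[y_j(t) | y_j(t-d), y_i(t-d)]
        p_given_self = n_ab[(a, b)] / n_b[b]   # p[y_j(t) | y_j(t-d)]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


def feedback_weight(te_avg, alpha, beta, w_min, L, phi):
    """Feedback weight from layer beta back to layer alpha, following the
    case rule quoted above. L is an assumed positive normalizing constant."""
    if te_avg < phi:
        return w_min * abs(beta - alpha) / L
    return 0.0


# Toy data (hypothetical): the target copies the source with a lag of one step,
# so information flows from source to target but not the other way round.
rng = random.Random(0)
src = [rng.randint(0, 1) for _ in range(2000)]
tgt = [0] + src[:-1]

te_forward = transfer_entropy(src, tgt, d=1)
te_backward = transfer_entropy(tgt, src, d=1)
```

On the toy data, `te_forward` should come out close to one bit and `te_backward` close to zero, which matches the intuition that transfer entropy measures directed influence. Real activations are continuous, so a binning or density-estimation step would be needed before this kind of counting estimator applies.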