Channel Capacity of Neural Networks

Gen Ye, Tong Lin
DOI: https://doi.org/10.1007/978-3-030-86380-7_21
2021-01-01
Abstract: The Occam's Razor principle suggests a preference for simpler models and raises an enduring question: what is the proper definition of a model's complexity? In this work, we regard neural networks as communication channels and measure their complexity by their channel capacity, that is, the maximum information retained in the output of a neural network. Furthermore, we show a connection between the L2-norm of the weight matrix of a linear model and its channel capacity through the singular values of the weight matrix. On image classification problems, we find that regularizing different neural networks by constraining their channel capacity effectively boosts generalization performance and outperforms other information-theoretic regularization methods.
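To make the link between singular values, the weight norm, and information concrete, here is a minimal sketch. It is not the paper's formulation: it assumes a textbook linear Gaussian channel y = Wx + n with isotropic Gaussian input (variance `input_var`) and isotropic Gaussian noise (variance `noise_var`), and it computes the mutual information for that fixed input distribution, which is only a lower bound on the capacity under a power constraint. The function names and parameters are hypothetical.

```python
import numpy as np

def gaussian_mutual_information_bits(W, input_var=1.0, noise_var=1.0):
    """I(X; Y) in bits for y = W x + n, with x ~ N(0, input_var * I)
    and n ~ N(0, noise_var * I). Assumed setting, not the paper's definition."""
    s = np.linalg.svd(W, compute_uv=False)   # singular values of W
    snr = input_var * s**2 / noise_var       # per-mode signal-to-noise ratio
    return 0.5 * np.sum(np.log2(1.0 + snr))

def frobenius_bound_bits(W, input_var=1.0, noise_var=1.0):
    """Upper bound from log2(1 + x) <= x / ln 2 and sum(s_i^2) = ||W||_F^2."""
    fro_sq = np.sum(W**2)                    # squared Frobenius (L2) norm
    return input_var * fro_sq / (2.0 * noise_var * np.log(2.0))

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))                  # illustrative weight matrix
info = gaussian_mutual_information_bits(W)
bound = frobenius_bound_bits(W)
print(f"mutual information: {info:.3f} bits, norm-based bound: {bound:.3f} bits")
```

Under these assumptions, both quantities depend on W only through its singular values, so shrinking the squared norm of W (as weight decay does) also lowers the ceiling on the information the linear map can pass.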