Opening the black box of deep learning

Dian Lei, Xiaoxiao Chen, Jianfei Zhao
DOI: https://doi.org/10.48550/arXiv.1805.08355
2018-05-22
Abstract: The great success of deep learning shows that the technology contains profound truth. Understanding its internal mechanism has important implications for the development of the technology and its effective application in various fields, and it also provides meaningful insight into the mechanisms of the human brain. At present, most theoretical research on deep learning is based on mathematics. This dissertation proposes that the neural network of deep learning is a physical system, examines deep learning from three perspectives (microscopic, macroscopic, and the physical world view), and answers multiple theoretical puzzles in deep learning by using physics principles. For example, from the perspective of quantum mechanics and statistical physics, it presents calculation methods for convolution, pooling, normalization, and the Restricted Boltzmann Machine, as well as the selection of cost functions. It explains why deep learning must be deep, what characteristics are learned in deep learning, why Convolutional Neural Networks do not have to be trained layer by layer, and what the limitations of deep learning are. It also proposes a theoretical direction and basis for the further development of deep learning now and in the future. The brilliance of physics flashes in deep learning, and we try to establish deep learning technology on the scientific theory of physics.
Machine Learning
What problem does this paper attempt to address?
The problem this paper attempts to address is the "black box" issue of deep learning. Specifically, although deep learning has achieved significant success in fields such as image classification, speech recognition, and natural language processing, its internal mechanisms still lack clear theoretical explanations. The paper points out that most current theoretical research on deep learning is primarily based on mathematical methods and lacks an understanding of physical laws. Therefore, the authors propose viewing deep learning as a physical system and exploring its principles from three different perspectives: microscopic, macroscopic, and the worldview of physics.

The core objectives of the paper include:

1. **Establishing a deep learning model based on physical theories**: Explaining various phenomena in deep learning, such as convolution computation, pooling, normalization, and the working mechanism of the Restricted Boltzmann Machine, through principles of quantum mechanics and statistical physics.
2. **Explaining why deep learning must be "deep"**: Investigating what features deep networks learn, why convolutional neural networks do not require layer-by-layer training, and the limitations of deep learning.
3. **Providing theoretical guidance**: Offering theoretical direction and foundation for the further development of deep learning technology, emphasizing the importance of physics, and attempting to describe the conceptual model of deep learning in physical terms.

In summary, the paper aims to reveal the intrinsic mechanisms of deep learning through physical methods, thereby providing solid theoretical support for future research and development.