Learning principle and mathematical realization of the learning mechanism in the brain

Taisuke Katayose
2023-11-22
Abstract:While deep learning has achieved remarkable success, there is no clear explanation about why it works so well. In order to discuss this question quantitatively, we need a mathematical framework that explains what learning is in the first place. After several considerations, we succeeded in constructing a mathematical framework that can provide a unified understanding of all types of learning, including deep learning and learning in the brain. We call it learning principle, and it follows that all learning is equivalent to estimating the probability of input data. We not only derived this principle, but also mentioned its application to actual machine learning models. For example, we found that conventional supervised learning is equivalent to estimating conditional probabilities, and succeeded in making supervised learning more effective and generalized. We also proposed a new method of defining the values of estimated probability using differentiation, and showed that unsupervised learning can be performed on arbitrary dataset without any prior knowledge. Namely, this method is a general-purpose machine learning in the true sense. Moreover, we succeeded in describing the learning mechanism in the brain by considering the time evolution of a fully or partially connected model and applying this new method. The learning principle provides solutions to many unsolved problems in deep learning and cognitive neuroscience.
Machine Learning,Artificial Intelligence,Information Theory,Neurons and Cognition
What problem does this paper attempt to address?
This paper aims to address the reasons for the success of deep learning and propose a unified learning principle to explain all types of learning, including deep learning and the learning mechanisms of the brain. The author builds a mathematical framework, namely the learning principle, through a series of discussions, indicating that all learning fundamentally involves the estimation of input data probabilities. In addition to deriving this principle, the paper also explores its applications in practical machine learning models, such as equating traditional supervised learning with conditional probability estimation and improving their efficiency and generalization ability. Moreover, the paper introduces a new method for estimating probabilities using differential definition, enabling unsupervised learning without any prior knowledge. This method represents a true universal machine learning approach. The paper further describes the learning mechanisms of the brain by considering the temporal evolution of fully or partially connected models, and addresses some unresolved issues in deep learning and cognitive neuroscience.