Abstract: We aim to bridge the gap between common-sense, few-sample human learning and large-data machine learning. We derive a theory of human-like few-shot learning from the von Neumann-Landauer principle. Modelling human learning is difficult because how people learn varies from one person to another. Under commonly accepted definitions, we prove that all human or animal few-shot learning, and the major models of such learning, including the Free Energy Principle and Bayesian Program Learning, approximate our theory under the Church-Turing thesis. We find that deep generative models such as the variational autoencoder (VAE) can be used to approximate our theory, and that they perform significantly better than baseline models, including deep neural networks, on image recognition, low-resource language processing, and character recognition.
What problem does this paper attempt to address?
This paper addresses the gap between human (or animal) few-shot learning and machine learning that relies on large amounts of data. Specifically, the authors aim to establish a theoretical framework that describes and simulates human-style few-shot learning, and to prove that existing major models (such as the Free Energy Principle and Bayesian Program Learning) approximate this framework.
### Main problems:
1. **Modeling human learning mechanisms**: How to simulate the way humans learn, in particular their ability to pick up a new task quickly from only a few samples.
2. **Diversity problem**: Different individuals use different parameters, features, and even algorithms when learning, so the framework must cover this diversity.
3. **Limitations of existing models**: Current deep-learning methods usually rely on large amounts of labeled data or on task-specific data augmentation, which is inconsistent with how humans and animals naturally learn.
### Solutions:
- **Theoretical basis**: Starting from the von Neumann-Landauer principle, the authors derive an optimal few-shot learning model and prove that all other models (including models of human learning) can be regarded as approximations of this theory.
- **Compression mechanism**: The information distance \(E(x, y)\) is introduced to measure the similarity between two objects. It is defined in terms of reversible computation, which is what makes the resulting model theoretically optimal.
- **Experimental verification**: On image recognition, low-resource language processing, and character recognition tasks, a generative model based on the variational autoencoder (VAE) is shown to perform better in few-shot learning than the baseline models.
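The idea of classifying a new object by its information distance to a handful of labeled examples can be sketched in a few lines. The sketch below is not the paper's method: it swaps the VAE-based approximation for the normalized compression distance (NCD), a standard computable stand-in for information distance, and uses `zlib` as the compressor. The function names (`compressed_len`, `ncd`, `classify`) and the toy data are assumptions made for illustration only.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude upper bound on K(data)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, a computable proxy for E(x, y)."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: bytes, support: dict) -> str:
    """Return the label of the support example closest to the query (1-NN)."""
    best_label, best_dist = None, float("inf")
    for label, examples in support.items():
        for example in examples:
            dist = ncd(query, example)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


# Hypothetical few-shot support set: one labeled example per class.
support = {
    "text": [b"the quick brown fox jumps over the lazy dog " * 5],
    "digits": [b"0123456789 9876543210 " * 10],
}
query = b"the quick brown fox jumps over the lazy cat " * 5
print(classify(query, support))
```

No training step is involved here: the only "model" is the compressor, so swapping in a better compressor (or, as in the paper, a learned generative model) changes the quality of the distance without changing the classification rule.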
### Conclusions:
The authors propose a new theoretical framework that both explains and simulates human few-shot learning, and they demonstrate its effectiveness in practical applications. They also explore how consciousness can be understood within this theory, proposing the concept of an "interestingness" classifier and arguing that the ability to label is important for attaining a certain degree of consciousness.
### Formula summary:
- **Energy function**:
\[
E_U(x, y)=\min\{|p|: U(x, p) = y, U(y, p)=x\}
\]
where \(U\) is a universal Turing machine or the brain, and \(p\) is a reversible transformation program.
- **Information distance theorem**:
\[
E(x, y)=\max\{K(x|y), K(y|x)\}+O(1)
\]
where \(K(x|y)\) is the Kolmogorov complexity of \(x\) given \(y\).
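To make the definition of \(E_U(x, y)\) more concrete, the following toy sketch shows what it means for a single program \(p\) to map \(x\) to \(y\) and \(y\) back to \(x\). It assumes a hypothetical machine `toy_machine` that simply XORs its input with the program bytes; this machine is not universal, and the program found here is not claimed to be the shortest one, so the sketch illustrates only the two-way condition \(U(x, p) = y\), \(U(y, p) = x\), not the minimization.

```python
def toy_machine(data: bytes, program: bytes) -> bytes:
    """Hypothetical stand-in for U: XOR the input with the program, byte by byte."""
    if len(data) != len(program):
        raise ValueError("this toy machine requires equal-length inputs")
    return bytes(d ^ p for d, p in zip(data, program))


def two_way_program(x: bytes, y: bytes) -> bytes:
    """Build one program p such that toy_machine(x, p) == y and toy_machine(y, p) == x."""
    if len(x) != len(y):
        raise ValueError("this toy construction requires equal-length strings")
    return bytes(a ^ b for a, b in zip(x, y))


x = b"few-shot"
y = b"learning"
p = two_way_program(x, y)

# The same program converts in both directions, as the definition requires.
print(toy_machine(x, p) == y, toy_machine(y, p) == x)
```

Because XOR is its own inverse, applying `p` twice returns the original string, which is the reversibility that the definition relies on. In the actual theory the quantity of interest is the length of the shortest such program on a universal machine.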
Through these formulas and theoretical derivations, the authors offer a new perspective for understanding and simulating the human few-shot learning mechanism.