A Computational Model of Representation Learning in the Brain Cortex, Integrating Unsupervised and Reinforcement Learning

Giovanni Granato, Emilio Cartoni, Federico Da Rold, Andrea Mattera, Gianluca Baldassarre
DOI: https://doi.org/10.48550/arXiv.2106.03688
2021-06-07
Abstract: A common view on the brain learning processes proposes that the three classic learning paradigms -- unsupervised, reinforcement, and supervised -- take place in the cortex, the basal ganglia, and the cerebellum, respectively. However, dopamine outbursts, usually assumed to encode reward, are not limited to the basal ganglia but also reach prefrontal, motor, and higher sensory cortices. We propose that in the cortex the same reward-based trial-and-error processes might support the acquisition not only of motor representations but also of sensory representations. In particular, reward signals might guide trial-and-error processes that mix with associative learning processes to support the acquisition of representations that better serve downstream action selection. We tested the soundness of this hypothesis with a computational model that integrates unsupervised learning (Contrastive Divergence) and reinforcement learning (REINFORCE). The model was tested with a task requiring different responses to different visual images grouped in categories involving either colour, shape, or size. Results show that a balanced mix of unsupervised and reinforcement learning processes leads to the best performance. Indeed, excessive unsupervised learning tends to under-represent task-relevant features, while excessive reinforcement learning tends to learn slowly at first and then to get stuck in local minima. These results stimulate future empirical studies on category learning directed to investigate similar effects in the extrastriate visual cortices. Moreover, they prompt further computational investigations directed to study the possible advantages of integrating unsupervised and reinforcement learning processes.
Neurons and Cognition, Machine Learning
What problem does this paper attempt to address?
This paper asks how Unsupervised Learning (UL) and Reinforcement Learning (RL) might be integrated in the cerebral cortex to support the learning of sensory representations that serve action selection. Specifically, the authors hypothesize that reward-based trial-and-error learning can promote the learning not only of motor representations but also of sensory representations. The core points of this hypothesis are:
1. In the cerebral cortex, reward-based trial-and-error learning coexists with associative learning.
2. The trial-and-error mechanism for learning non-motor representations is similar to the one for learning motor representations: exploratory noise generates candidate solutions, and reward consolidates the effective ones.
3. Combined, associative learning and trial-and-error learning can form better action-oriented sensory representations, which in turn better serve downstream action selection.
To test this hypothesis, the authors built a computational model that combines Unsupervised Learning (the Contrastive Divergence algorithm) with Reinforcement Learning (the REINFORCE algorithm). The model was tested on a task that required different responses to different visual images, grouped into categories by color, shape, or size. The results show that a balanced mix of Unsupervised Learning and Reinforcement Learning achieves the best performance. Over-reliance on either one degrades performance: too much Unsupervised Learning tends to under-represent task-relevant features, while too much Reinforcement Learning tends to learn slowly at first and then get stuck in local minima. These results offer computational support for the hypothesis and motivate further research on how the cerebral cortex integrates sensory representation learning with action selection.
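To make the idea of mixing the two learning signals concrete, here is a minimal NumPy sketch for a single layer of stochastic binary units. It is an illustration under stated assumptions, not the authors' implementation: the function names, the omission of biases, the single CD step, the scalar baseline, and the `mix` coefficient that linearly blends the two updates are all hypothetical choices made for this example. The reward is assumed to be supplied by some downstream action-selection stage that is not modeled here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_gradient(W, v0, rng):
    """One-step Contrastive Divergence estimate for RBM weights (biases omitted)."""
    p_h0 = sigmoid(v0 @ W)                       # hidden probabilities given the data
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)  # stochastic hidden sample
    p_v1 = sigmoid(h0 @ W.T)                     # reconstruction of the visible layer
    p_h1 = sigmoid(p_v1 @ W)                     # hidden probabilities given reconstruction
    grad = np.outer(v0, p_h0) - np.outer(p_v1, p_h1)    # positive minus negative phase
    return grad, h0, p_h0

def reinforce_gradient(v0, h0, p_h0, reward, baseline):
    """REINFORCE for Bernoulli units: (r - b) * d log p(h | v) / dW."""
    # For a sigmoid Bernoulli unit, d log p(h) / d(pre-activation) = h - p.
    return (reward - baseline) * np.outer(v0, h0 - p_h0)

def mixed_update(W, v0, reward, baseline, mix, lr, rng):
    """Blend both signals: mix=0 is pure unsupervised, mix=1 is pure reinforcement.

    `mix` is a hypothetical blending coefficient introduced for this sketch.
    """
    g_ul, h0, p_h0 = cd1_gradient(W, v0, rng)
    g_rl = reinforce_gradient(v0, h0, p_h0, reward, baseline)
    W_new = W + lr * ((1.0 - mix) * g_ul + mix * g_rl)
    return W_new, h0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.1, size=(6, 4))        # 6 visible units, 4 hidden units
    v = np.array([1, 0, 1, 1, 0, 0], dtype=float)
    # The reward would come from the task; here it is a placeholder value.
    W, h = mixed_update(W, v, reward=1.0, baseline=0.5, mix=0.5, lr=0.1, rng=rng)
    print(W.shape, h)
```

In this sketch the stochastic sampling of the hidden units plays the role of exploratory noise, and the reward term strengthens whichever sampled representation preceded a better-than-baseline outcome. Sweeping `mix` between 0 and 1 is one simple way to probe the balance between the two processes that the summary describes.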