Meta-prediction extends human cortical and subcortical reward learning

Jaehoon Shin, Jee Hang Lee, Sang Wan Lee
DOI: https://doi.org/10.1101/2024.12.08.627441
2024-12-09
Abstract: Human reward learning is constrained by environmental structure. Stable environments facilitate reward prediction but limit learning experiences[1-8], while uncertain environments hinder predictability and learnability[9-12]. We propose a novel framework that extends these boundaries through "meta-prediction": predicting human prediction. Meta-prediction entwines two Bellman equations, one for human prediction and the other for predicting the prediction error of the former. The framework pretrains computational models that imitate individuals' reward prediction (specification), then generates new tasks to extremize the models' prediction errors (generalization). Simulations with 82 subjects' data generated subject-independent task designs across four scenarios without compromising learnability. In an independent fMRI experiment with 49 participants, meta-prediction guided behavior and neural activity in the ventral striatum, lateral prefrontal cortex, and insular cortex, areas that encode prediction errors. We also demonstrated that meta-prediction can generate complex tasks compositionally to discern human reward learning bias. Our framework redefines the role of tasks in cognitive science and AI.
Neuroscience