Abstract: For over two decades, phasic activity in midbrain dopamine neurons was considered synonymous with the prediction error in temporal-difference reinforcement learning [1-4]. Central to this proposal is the notion that reward-predictive stimuli become endowed with the scalar value of predicted rewards. When these cues are subsequently encountered, their predictive value is compared to the value of the actual reward received, allowing for the calculation of prediction errors [5, 6]. Phasic firing of dopamine neurons was proposed to reflect this computation [1, 2, ...] [abstract truncated]

1. Schultz W., Dayan P., Montague P.R. A neural substrate of prediction and reward. Science. 1997; 275: 1593-1599. https://doi.org/10.1126/science.275.5306.1593
2. Waelti P., Dickinson A., Schultz W. Dopamine responses comply with basic assumptions of formal learning theory. Nature. 2001; 412: 43-48. https://doi.org/10.1038/35083500
3. Glimcher P.W. Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proc. Natl. Acad. Sci. USA. 2011; 108: 15647-15654. https://doi.org/10.1073/pnas.1014269108
4. Schultz W. Dopamine reward prediction-error signalling: a two-component response. Nat. Rev. Neurosci. 2016; 17: 183-195. https://doi.org/10.1038/nrn.2015.26
5. Sutton R.S., Barto A.G. Toward a modern theory of adaptive networks: expectation and prediction. Psychol. Rev. 1981; 88: 135-170. https://doi.org/10.1037/0033-295x.88.2.135
6. Rescorla R.A., Wagner A.R. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and nonreinforcement. In: Black A., Prokasy W. (eds.) Classical Conditioning II: Current Research and Theory. Appleton-Century-Crofts, 1972: 64-99.
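The abstract describes a comparison between a cue's predicted value and the reward actually received. It can be illustrated with a minimal sketch of the standard temporal-difference error, delta = r + gamma * V(next) - V(current). The abstract itself gives no equation, so the function name, the two-step cue-then-reward trial, and the parameter values (alpha, gamma, reward) are all illustrative assumptions rather than the authors' model.

```python
def simulate_cue_reward(n_trials=200, alpha=0.1, gamma=1.0, reward=1.0):
    """Return per-trial TD errors at cue onset and at reward delivery.

    Hypothetical toy task: an unpredictable cue is always followed by reward.
    """
    v_cue = 0.0       # learned scalar value attached to the cue
    v_baseline = 0.0  # pre-cue state; held at 0 because cue onset is assumed unpredictable
    cue_errors, reward_errors = [], []
    for _ in range(n_trials):
        # Error at cue onset: no reward yet, value steps from baseline to V(cue).
        delta_cue = 0.0 + gamma * v_cue - v_baseline
        # Error at reward: reward arrives and the trial ends (terminal value 0).
        delta_reward = reward + gamma * 0.0 - v_cue
        # The cue's value moves toward the received reward by a fraction alpha.
        v_cue += alpha * delta_reward
        cue_errors.append(delta_cue)
        reward_errors.append(delta_reward)
    return cue_errors, reward_errors


cue_errors, reward_errors = simulate_cue_reward()
print(f"first trial: cue error = {cue_errors[0]:.2f}, reward error = {reward_errors[0]:.2f}")
print(f"last trial:  cue error = {cue_errors[-1]:.2f}, reward error = {reward_errors[-1]:.2f}")
```

Under these assumptions the error at reward delivery shrinks toward zero as the cue's value approaches the reward magnitude, while the error at cue onset grows toward that magnitude. This is the pattern that the reward prediction error account attributes to phasic dopamine firing.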

Rethinking dopamine as generalized prediction error

Dopamine, Updated: Reward Prediction Error and Beyond

Representation learning with reward prediction errors

Believing in dopamine

Dopamine, Prediction Error and Beyond

Mesolimbic dopamine encodes reward prediction errors independent of learning rates

Dopamine Prediction Errors in Reward Learning and Addiction: From Theory to Neural Circuitry

Dopamine reward prediction error coding

Dopamine, Inference, and Uncertainty

The Dopamine Prediction Error: Contributions to Associative Models of Reward Learning

Dopamine errors drive excitatory and inhibitory components of backward conditioning in an outcome-specific manner

A feature-specific prediction error model explains dopaminergic heterogeneity

Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework

The many worlds hypothesis of dopamine prediction error: implications of a parallel circuit architecture in the basal ganglia

Dopaminergic prediction errors in the ventral tegmental area reflect a multithreaded predictive model

Model-based predictions for dopamine

Dopamine transients encode reward prediction errors independent of learning rates