Value-Driven Adaptations of Mesolimbic Dopamine Release Are Governed by Both Model-Based and Model-Free Mechanisms

Rhiannon Robke,Tara Arbab,Rachel Smith,Ingo Willuhn
DOI: https://doi.org/10.1523/ENEURO.0223-24.2024
2024-07-03
eNeuro
Abstract:The magnitude of dopamine signals elicited by rewarding events and their predictors is updated when reward value changes. It is actively debated how readily these dopamine signals adapt and whether adaptation aligns with model-free or model-based reinforcement-learning principles. To investigate this, we trained male rats in a pavlovian-conditioning paradigm and measured dopamine release in the nucleus accumbens core in response to food reward (unconditioned stimulus) and reward-predictive conditioned stimuli (CS), both before and after reward devaluation, induced via either sensory-specific or nonspecific satiety. We demonstrate that (1) such devaluation reduces CS-induced dopamine release rapidly, without additional pairing of CS with devalued reward and irrespective of whether the devaluation was sensory-specific or nonspecific. In contrast, (2) reward devaluation did not decrease food reward-induced dopamine release. Surprisingly, (3) postdevaluation reconditioning, by additional pairing of CS with devalued reward, rapidly reinstated CS-induced dopamine signals to predevaluation levels. Taken together, we identify distinct, divergent adaptations in dopamine-signal magnitude when reward value is decreased: CS dopamine diminishes but reinstates fast, whereas reward dopamine is resistant to change. This implies that, respective to abovementioned findings, (1) CS dopamine may be governed by a model-based mechanism and (2) reward dopamine by a model-free one, where (3) the latter may contribute to swift reinstatement of the former. However, changes in CS dopamine were not selective for sensory specificity of reward devaluation, which is inconsistent with model-based processes. Thus, mesolimbic dopamine signaling incorporates both model-free and model-based mechanisms and is not exclusively governed by either.
What problem does this paper attempt to address?