Adolescent and adult mice use both incremental reinforcement learning and short term memory when learning concurrent stimulus-action associations

Juliana B. Chase, Liyu Xia, Lung-Hao Tai, Wan Chen Lin, Anne G.E. Collins, Linda Wilbrecht

DOI: https://doi.org/10.1101/2024.04.29.591768

2024-05-01

Abstract: Computational modeling has revealed that human research participants use both rapid working memory (WM) and incremental reinforcement learning (RL) (RL+WM) to solve a simple instrumental learning task, relying on WM when the number of stimuli is small and supplementing with RL when the number of stimuli exceeds WM capacity. Inspired by this work, we examined which learning systems and strategies are used by adolescent and adult mice when they first acquire a conditional associative learning task. In a version of the human RL+WM task translated for rodents, mice were required to associate odor stimuli (from a set of 2 or 4 odors) with a left or right port to receive reward. Using logistic regression and computational models to analyze the first 200 trials per odor, we determined that mice used both incremental RL and stimulus-insensitive, one-back strategies to solve the task. While these one-back strategies may be a simple form of short-term or working memory, they did not approximate the boost to learning performance that has been observed in human participants using WM in a comparable task. Adolescent and adult mice also showed comparable performance, with no change in learning rate or softmax beta parameters with adolescent development and task experience. However, reliance on a one-back perseverative, win-stay strategy increased with development in males in both odor set sizes. Our findings advance a simple conditional associative learning task and new models to enable the isolation and quantification of reinforcement learning alongside other strategies mice use while learning to associate stimuli with rewards within a single behavioral session. These data and methods can inform and aid comparative study of reinforcement learning across species.
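To make the modeling terms in the abstract concrete, the following is a minimal sketch of an incremental RL learner with a softmax choice rule and a stimulus-insensitive, one-back win-stay bonus. It is an illustration under stated assumptions, not the authors' fitted model: the function names, the parameter names (`alpha`, `beta`, `stick`), their values, the initial values of 0.5, and the deterministic reward rule are all hypothetical.

```python
import math
import random


def softmax_right_prob(q_left, q_right, beta, stick, prev_win_side):
    """P(choose right) for a two-option softmax with a one-back win-stay bonus.

    prev_win_side is +1 if the previous trial was a rewarded right choice,
    -1 if it was a rewarded left choice, and 0 otherwise (assumed coding).
    """
    logit = beta * (q_right - q_left) + stick * prev_win_side
    return 1.0 / (1.0 + math.exp(-logit))


def simulate(n_odors=2, trials_per_odor=200, alpha=0.1, beta=5.0,
             stick=0.5, seed=0):
    """Simulate one session; all parameter values are illustrative only."""
    rng = random.Random(seed)
    # Assumed mapping: even-numbered odors are rewarded left (0), odd right (1).
    correct_side = {odor: odor % 2 for odor in range(n_odors)}
    # One value per (odor, side), initialised at 0.5 (an assumption).
    q = {odor: [0.5, 0.5] for odor in range(n_odors)}
    schedule = [odor for odor in range(n_odors) for _ in range(trials_per_odor)]
    rng.shuffle(schedule)  # interleave odors within the session

    prev_win_side = 0
    outcomes = []
    for odor in schedule:
        p_right = softmax_right_prob(q[odor][0], q[odor][1],
                                     beta, stick, prev_win_side)
        choice = 1 if rng.random() < p_right else 0
        reward = 1.0 if choice == correct_side[odor] else 0.0
        # Incremental (delta-rule) update of the chosen side for this odor.
        q[odor][choice] += alpha * (reward - q[odor][choice])
        # The one-back term ignores which odor was presented.
        if reward > 0:
            prev_win_side = 1 if choice == 1 else -1
        else:
            prev_win_side = 0
        outcomes.append((odor, choice, reward))
    return outcomes, q
```

In this sketch `alpha` plays the role of the learning rate and `beta` the softmax inverse temperature named in the abstract, while `stick` controls how strongly the previous rewarded choice is repeated regardless of the current odor. Calling `simulate(n_odors=4)` gives the larger set size under the same assumptions.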

Neuroscience

What problem does this paper attempt to address?

This paper asks whether adolescent and adult mice use short-term memory or other strategies alongside reinforcement learning when they first learn new stimulus-action associations in a conditional associative learning task. Specifically, the researchers test two hypotheses:

1. **Whether mice use working memory at low set sizes**: In human studies, participants rely on fast but capacity-limited working memory (WM) when the number of stimuli is small, and supplement it with incremental reinforcement learning (RL) when the number of stimuli exceeds WM capacity. The researchers therefore predicted that mice would show similar signatures of WM use when learning with a small set size.
2. **Whether learning strategies change across adolescent development**: Based on previous studies in rodents and humans, the researchers hypothesized that mice become more reliant on working memory in learning tasks as they age.

To test these hypotheses, the researchers designed a mouse version of the human RL+WM task. They analyzed the first 200 trials per odor using logistic regression and computational models to identify the learning strategies the mice used. The results show that mice use incremental reinforcement learning together with simple one-back strategies, which may be a primitive form of short-term memory but are not equivalent to human working memory. The study also found that reliance on a one-back perseverative strategy increased with development in male mice, while this change was not observed in female mice.

With these findings, the researchers aim to clarify the similarities and differences between the learning mechanisms mice and humans use in simple tasks, and to provide methodological support for cross-species comparison.
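The one-back strategies described above can be summarized descriptively before any model fitting. The sketch below computes stimulus-insensitive win-stay and lose-shift rates from a sequence of choices and rewards. The function name and the input encoding are hypothetical, and this is a simple descriptive statistic rather than the logistic regression reported in the paper.

```python
def one_back_rates(choices, rewards):
    """Return (win_stay_rate, lose_shift_rate) across consecutive trials.

    choices: sequence of side labels, e.g. "L" or "R" (assumed encoding).
    rewards: sequence of 0/1 outcomes aligned with choices.
    Stimulus identity is deliberately ignored, so the rates reflect a
    stimulus-insensitive, one-back dependence on the previous trial.
    """
    win_n = win_stay = lose_n = lose_shift = 0
    for t in range(1, len(choices)):
        stayed = choices[t] == choices[t - 1]
        if rewards[t - 1]:
            win_n += 1
            win_stay += int(stayed)
        else:
            lose_n += 1
            lose_shift += int(not stayed)
    # Rates are undefined (NaN) when no wins or no losses occurred.
    ws = win_stay / win_n if win_n else float("nan")
    ls = lose_shift / lose_n if lose_n else float("nan")
    return ws, ls


# Example with made-up data: five trials, sides and outcomes chosen arbitrarily.
example_rates = one_back_rates(["L", "L", "R", "R", "L"], [1, 0, 1, 1, 0])
```

A win-stay rate well above what the learned values alone would predict is the kind of pattern a one-back perseverative term is meant to capture, but separating the two requires a model comparison of the sort the paper describes.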
