Global neural encoding of model-free and inference-based strategies in mice

Shuo Wang, Huayi Gao, Kotaro Ishizu, Akihiro Funamizu
DOI: https://doi.org/10.1101/2024.02.08.579559
2024-02-09
Abstract: When a simple model-free strategy does not provide sufficient outcomes, an inference-based strategy estimating a hidden task structure becomes essential for optimizing choices. However, the neural circuitry involved in inference-based strategies is still unclear. We developed a tone frequency discrimination task in head-fixed mice in which the tone category of the current trial depended on the category of the previous trial. When the tone category was repeated every trial, the mice continued to use the default model-free strategy, as well as when tone was randomly presented, to bias the choices. In contrast, the default strategy gradually shifted to an inference-based strategy when the tone category was alternated in each trial. Brain-wide electrophysiological recording during the overtrained phase suggested that the neural activity of the frontal and sensory cortices, hippocampus, and striatum was correlated with the reward expectation of both the model-free and inference-based strategies. These results suggest the global encoding of multiple strategies in the brain.
Neuroscience
What problem does this paper attempt to address?
This paper aims to understand how the brain encodes different decision-making strategies, in particular how the model-free strategy and the inference-based strategy differ and relate. The researchers designed a tone frequency discrimination task, observed how mouse behavior changed across task conditions, and used electrophysiological recordings to analyze how these strategies are reflected in neural activity across brain regions.

### Background and Problem Description of the Paper

1. **Perceptual Decision-making and Hidden Context Estimation**
   - Perceptual decision-making requires estimating the hidden context from sensory inputs to optimize behavior. Signal Detection Theory (SDT) points out that, to optimize behavior, the subject needs to estimate not only the expected outcome (value) of each context but also the probability that the context changes over time (the context-switching probability).
2. **Model-Free Reinforcement Learning (MFRL)**
   - A simple strategy is to estimate and update the value of each choice from past direct experience through trial and error, which can be achieved by model-free reinforcement learning. MFRL does not consider context switching.
3. **Inference-Based Strategy**
   - When the context has a dependency or structure, a simple MFRL model that relies solely on value estimation cannot optimize the choice. A strategy based on an internal model of the context-switching probability then becomes important; this is called the inference-based strategy.
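
The contrast between the two strategies can be sketched in a few lines. This is a minimal illustration, not the model fitted in the paper: it assumes a delta-rule update with a hypothetical learning rate `alpha` for the model-free strategy, and a hypothetical known switch probability `p_switch` for the inference-based strategy. The function names and the category labels `low` and `high` are illustrative.

```python
def model_free_update(values, choice, reward, alpha=0.2):
    """Delta-rule update: move the chosen option's value toward the reward.

    `alpha` is an assumed learning rate; unchosen options are left unchanged.
    """
    updated = dict(values)
    updated[choice] += alpha * (reward - updated[choice])
    return updated


def inferred_prior(prev_category, p_switch):
    """Prior over the next tone category under an internal switching model.

    `p_switch` is the assumed probability that the category changes
    from the previous trial to the current one.
    """
    other = "high" if prev_category == "low" else "low"
    return {prev_category: 1.0 - p_switch, other: p_switch}


# Model-free: a rewarded left choice raises only the value of "left".
values = model_free_update({"left": 0.5, "right": 0.5}, "left", reward=1.0)

# Inference-based: with a high switch probability, the category that was
# NOT presented on the previous trial becomes the more likely one.
prior = inferred_prior("low", p_switch=0.9)
```

With a high `p_switch`, the inference-based prior favors the category that was not presented on the previous trial, whereas the model-free update only raises the value of whichever choice was just rewarded and therefore biases toward repeating it.
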
### Research Purposes

- **Explore the Neural Coding of Decision-making Strategies in Different Regions of the Brain**
  - This study uses behavioral experiments and electrophysiological recordings to explore how different brain regions encode the model-free and inference-based strategies. The researchers designed a tone frequency discrimination task, varied the task condition (repetition or alternation), observed how mouse behavior changed, and recorded neural activity in multiple brain regions.
- **Compare the Neural Mechanisms of the Two Strategies**
  - The researchers ask whether the brain uses the same strategy under different conditions and how each strategy is represented in neural activity. For example, they found that neural activity in the frontal and sensory cortices, hippocampus, and striatum was correlated with the reward expectation of both strategies, suggesting that the brain encodes multiple strategies on a global scale.

### Experimental Design and Main Findings

- **Task Design**
  - Head-fixed mice were trained on a tone frequency discrimination task. The task had a neutral condition (p = 0.5), a repetition condition (p = 0.2), and an alternation condition (p = 0.9). In the neutral condition, tone categories were presented randomly; in the repetition condition, the tone category was more likely to repeat; in the alternation condition, the tone category was more likely to alternate.
- **Behavioral Analysis**
  - In the neutral condition, mice showed a bias towards repeating the choice that had been rewarded on the previous trial. In the repetition condition, mice quickly learned to bias their choices towards repetition, whereas in the alternation condition they needed more time to learn to bias their choices towards alternation.
- **Electrophysiological Recordings**
  - The researchers used Neuropixels probes to record neural activity in the orbitofrontal cortex (OFC), posterior parietal cortex (PPC), hippocampus (HPC), auditory cortex (AC), primary motor cortex (M1), and striatum (STR). Neural activity in these regions was related to reward expectation and showed different patterns under different conditions.

### Conclusion

- This study examines the neural coding of different decision-making strategies, in particular the model-free and inference-based strategies. The results suggest that regions across the brain encode multiple strategies on a global scale, which provides a new perspective for understanding decision-making in complex environments.
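
The neutral, repetition, and alternation conditions from the task design can be illustrated by simulating a tone-category sequence. This is a rough sketch, assuming that p is the probability that the tone category switches between consecutive trials. The function names, the labels `low` and `high`, and the seed are hypothetical and do not come from the paper.

```python
import random


def simulate_categories(n_trials, p_switch, seed=0):
    """Generate a tone-category sequence for a given switch probability.

    Each trial switches category relative to the previous trial with
    probability `p_switch` (an assumed reading of the task's p).
    """
    rng = random.Random(seed)
    categories = [rng.choice(["low", "high"])]
    for _ in range(n_trials - 1):
        prev = categories[-1]
        if rng.random() < p_switch:
            categories.append("high" if prev == "low" else "low")
        else:
            categories.append(prev)
    return categories


def repeat_rate(categories):
    """Fraction of consecutive trial pairs that share the same category."""
    pairs = list(zip(categories, categories[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)


# Compare how often the category repeats in each condition.
for name, p in [("neutral", 0.5), ("repetition", 0.2), ("alternation", 0.9)]:
    seq = simulate_categories(5000, p)
    print(name, round(repeat_rate(seq), 2))
```

Under this reading, a strategy that expects the previous category to repeat is right on roughly a fraction 1 - p of trials. That makes it adequate in the repetition condition and poor in the alternation condition, where expecting a switch is the better prediction.
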