Abstract: Reinforcement learning (RL) models are used extensively to study human behavior. These models rely on normative accounts of behavior and stress interpretability over predictive capability. More recently, neural network models have emerged as a descriptive modeling paradigm that offers high predictive power but limited interpretability. Here, we seek to augment the expressiveness of theoretical RL models with the flexibility and predictive power of neural networks. We introduce a novel framework, which we term theoretical-RNN (t-RNN), in which a recurrent neural network is trained to predict trial-by-trial behavior and to infer theoretical RL parameters, using artificial data from RL agents performing a two-armed bandit task. In three studies, we then examine the use of our approach to dynamically predict unseen behavior along with time-varying theoretical RL parameters. We first validate our approach using synthetic data with known RL parameters. Next, as a proof-of-concept, we apply our framework to two independent datasets of humans performing the same task. In the first dataset, we describe differences in the dynamics of theoretical RL parameters between a clinical psychiatric group and healthy controls. In the second dataset, we show that the exploration strategies of humans varied dynamically in response to task phase and difficulty. In all analyses, t-RNN predicted actions better than the stationary maximum-likelihood RL method. We discuss the use of neural networks to facilitate the estimation of latent RL parameters underlying choice behavior.

Currently, neural network models fitted directly to human behavioral data are thought to dramatically outperform theoretical computational models in terms of predictive accuracy. However, these networks do not provide a clear theoretical interpretation of the mechanisms underlying the observed behavior. Generating plausible theoretical explanations for observed human data is a major goal in computational neuroscience. Here, we provide a proof-of-concept for a novel method in which a recurrent neural network (RNN) is trained on artificial data generated from a known theoretical model to predict both trial-by-trial actions and theoretical parameters. We then freeze the RNN weights and use the network to predict both actions and theoretical parameters for empirical data. We first validate our approach using synthetic data where the theoretical parameters are known. We then show, using two empirical datasets, that our approach allows dynamic estimation of latent parameters while providing better action predictions than theoretical models fitted with a maximum-likelihood approach. This proof-of-concept suggests that neural networks can be trained to predict meaningful time-varying theoretical parameters.
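To make the setup above concrete, the following is a minimal sketch of how artificial training data with known, time-varying parameters might be generated, together with the likelihood that a stationary maximum-likelihood baseline would minimize. It assumes a Q-learning agent with a softmax choice rule, a learning rate `alpha`, an inverse temperature `beta`, and initial values of 0.5; the text specifies none of these, and all function and variable names are hypothetical. It illustrates the general idea and is not the authors' implementation.

```python
import math
import random


def simulate_q_agent(alphas, betas, reward_probs, seed=0):
    """Simulate one agent on a two-armed bandit (assumed Q-learning + softmax).

    alphas, betas: per-trial learning rate and inverse temperature. These
    are the time-varying "ground truth" labels a network could be trained
    to infer alongside the actions.
    reward_probs: reward probability of each arm (assumed fixed here).
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]  # assumed initial action values
    actions, rewards = [], []
    for alpha, beta in zip(alphas, betas):
        # With two arms, softmax reduces to a logistic of the value difference.
        p_arm1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        a = 1 if rng.random() < p_arm1 else 0
        r = 1 if rng.random() < reward_probs[a] else 0
        q[a] += alpha * (r - q[a])  # prediction-error update of the chosen arm
        actions.append(a)
        rewards.append(r)
    return actions, rewards


def stationary_nll(alpha, beta, actions, rewards):
    """Negative log-likelihood of the choices under fixed alpha and beta.

    A stationary baseline would minimize this over (alpha, beta).
    """
    q = [0.5, 0.5]
    nll = 0.0
    for a, r in zip(actions, rewards):
        p_arm1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        p_choice = p_arm1 if a == 1 else 1.0 - p_arm1
        nll -= math.log(max(p_choice, 1e-12))  # guard against log(0)
        q[a] += alpha * (r - q[a])
    return nll


# Example: learning rate drifts from 0.1 to 0.6 while beta stays constant.
n_trials = 200
alphas = [0.1 + 0.5 * t / (n_trials - 1) for t in range(n_trials)]
betas = [3.0] * n_trials
actions, rewards = simulate_q_agent(alphas, betas, reward_probs=[0.3, 0.7])
baseline_nll = stationary_nll(0.3, 3.0, actions, rewards)
```

In this sketch, each simulated sequence pairs the observable inputs (`actions`, `rewards`) with per-trial targets (`alphas`, `betas`), which is the kind of supervision that lets a network, once its weights are frozen, output parameter estimates for data where the true parameters are unknown.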

Harnessing the flexibility of neural networks to predict dynamic theoretical parameters underlying human choice behavior
