Abstract: Humans can implicitly learn complex perceptuo-motor skills over the course of large numbers of trials. This likely depends on our becoming better able to take advantage of ever richer and temporally deeper predictive relationships in the environment. Here, we offer a novel characterization of this process, fitting a non-parametric, hierarchical Bayesian sequence model to the reaction times of human participants' responses over ten sessions, each comprising thousands of trials, in a serial reaction time task involving higher-order dependencies. The model, adapted from the domain of language, forgetfully updates trial-by-trial, and seamlessly combines predictive information from shorter and longer windows onto past events, weighting the windows in proportion to their predictive power. As the model implies a posterior over window depths, we were able to determine how, and how many, previous sequence elements influenced individual participants' internal predictions, and how this changed with practice. Already in the first session, the model showed that participants had begun to rely on two previous elements (i.e., trigrams), thereby successfully adapting to the most prominent higher-order structure in the task. The extent to which local statistical fluctuations influenced participants' responses waned over subsequent sessions, as participants forgot the trigrams less readily and showed skilled performance. By the eighth session, a subset of participants shifted their prior further to consider a context deeper than two previous elements. Finally, participants showed resistance to interference and slow forgetting of the old sequence when it was changed in the final sessions. Model parameters for individual participants covaried appropriately with independent measures of working memory. In sum, the model offers the first principled account of the adaptive complexity and nuanced dynamics of humans' internal sequence representations during long-term implicit skill learning.
A central function of the brain is to predict. One challenge of prediction is that both external events and our own actions can depend on a variably deep temporal context of previous events or actions. For instance, in a short motor routine, like opening a door, our actions only depend on a few previous ones (e.g., push the handle if the key was turned). In longer routines such as coffee making, our actions require a deeper context (e.g., place the moka pot on the hob if coffee is ground, the pot is filled and closed, and the hob is on). We adopted a model from the natural language processing literature that matches humans' ability to learn variable-length relationships in sequences. This model explained the gradual emergence of more complex sequence knowledge and individual differences in an experiment where humans practiced a perceptual-motor sequence over 10 weekly sessions.
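To make the idea of forgetful, variable-depth prediction concrete, the following is a minimal sketch and not the model fitted in this work. It replaces the non-parametric hierarchical Bayesian model with a much simpler interpolated n-gram predictor: counts decay on every trial (forgetting), and the prediction from each deeper context is smoothed toward the prediction from the next shallower one. The class name `ForgetfulNgramMixture`, the helper `trial_probabilities`, and the parameters `decay` and `strength` are all hypothetical names introduced for illustration. Unlike the model described above, this sketch does not infer a posterior over window depths; deeper contexts simply gain influence as their decayed counts accumulate.

```python
from collections import defaultdict


class ForgetfulNgramMixture:
    """Illustrative stand-in: interpolated n-gram predictor with decaying counts."""

    def __init__(self, n_symbols=4, max_depth=2, decay=0.99, strength=1.0):
        self.n_symbols = n_symbols
        self.max_depth = max_depth   # deepest context window (in past elements)
        self.decay = decay           # per-trial multiplicative forgetting of counts
        self.strength = strength     # weight of the shallower prediction (assumed)
        # counts[depth][context] -> decayed count of each next symbol
        self.counts = [defaultdict(lambda: [0.0] * n_symbols)
                       for _ in range(max_depth + 1)]

    def _context(self, history, depth):
        # The last `depth` elements of the history; the empty tuple for depth 0.
        return tuple(history[len(history) - depth:])

    def predict(self, history):
        # Start from a uniform guess and refine it with ever deeper contexts.
        probs = [1.0 / self.n_symbols] * self.n_symbols
        for depth in range(min(self.max_depth, len(history)) + 1):
            c = self.counts[depth].get(self._context(history, depth))
            if c is None:
                break  # context never seen; deeper contexts are unseen as well
            total = sum(c)
            probs = [(c[k] + self.strength * probs[k]) / (total + self.strength)
                     for k in range(self.n_symbols)]
        return probs

    def update(self, history, symbol):
        # Forget: shrink every stored count a little before adding new evidence.
        for table in self.counts:
            for c in table.values():
                for k in range(self.n_symbols):
                    c[k] *= self.decay
        # Learn: credit the observed symbol in every context depth available.
        for depth in range(min(self.max_depth, len(history)) + 1):
            self.counts[depth][self._context(history, depth)][symbol] += 1.0


def trial_probabilities(model, sequence):
    """Probability the model assigned to each element just before observing it."""
    out, history = [], []
    for symbol in sequence:
        out.append(model.predict(history)[symbol])
        model.update(history, symbol)
        history.append(symbol)
    return out


# Toy sequence with second-order structure: the next element is fully
# determined by the two previous elements, but not by the previous one alone.
seq = [0, 0, 1, 1] * 50
deep = ForgetfulNgramMixture(max_depth=2)
shallow = ForgetfulNgramMixture(max_depth=1)
p_deep = trial_probabilities(deep, seq)
p_shallow = trial_probabilities(shallow, seq)
```

On this toy sequence, the predictor that can use two previous elements ends up assigning high probability to the upcoming element, whereas the one restricted to a single previous element stays near chance between the two alternatives. Lowering `decay` makes predictions track recent local statistics more strongly, which is one simple way to picture the influence of local statistical fluctuations.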

Learning Movement Sequences with a Delayed Reward Signal in a Hierarchical Model of Motor Function

Hierarchical Dynamical Models of Motor Function

Imitation Learning of Hierarchical Driving Model: from Continuous Intention to Continuous Trajectory

Self-organizing Continuous Attractor Networks and Motor Function

Hierarchical Reinforcement Learning, Sequential Behavior, and the Dorsal Frontostriatal System

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

A Simplified Cerebellar Model with Priority-based Delayed Eligibility Trace Learning for Motor Control

Reinforcement learning using a continuous time actor-critic framework with spiking neurons

Mechanisms of Hierarchical Reinforcement Learning in Corticostriatal Circuits 1: Computational Analysis

Motor primitive and sequence self-organization in a hierarchical recurrent neural network

Tracking human skill learning with a hierarchical Bayesian sequence model

Contributions of the basal ganglia to action sequence learning and performance

Age-dependent predictors of effective reinforcement motor learning across childhood

Brain-like neural dynamics for behavioral control develop through reinforcement learning

Continuous Online Sequence Learning with an Unsupervised Neural Network Model

Reward-driven adaptation of movements requires strong recurrent basal ganglia-cortical loops

Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

Hierarchical behavior control by a single class of interneurons

Thunderstruck: The ACDC model of flexible sequences and rhythms in recurrent neural circuits

Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning

Hierarchical reinforcement learning and decision making