Abstract: This study takes a new approach to showing how the central nervous system might encode time at the supra-second level, using recurrent neural nets (RNNs). The approach uses units with delayed feedback, in which the feedback weight determines the temporal properties of each neuron in the network architecture. When these feedback neurons are coupled, they form a multilayered dynamical system that can model temporal responses to step inputs in multidimensional systems. The timing network was implemented with separate recurrent "Go" and "No-Go" neural processing units, which process an individual stimulus indicating the time of reward availability. On each time step, the outputs of these units are converted to a pulse reflecting a weighted sum of the separate Go and No-Go signals. This output pulse then drives an integrator unit, whose feedback weight and input weights shape the pulse distribution. The system was used to model empirical data from rodents performing an instrumental "peak interval timing" task with two stimuli, Tone and Flash. During training, each stimulus signaled reward availability at a different time after stimulus onset. Rodent performance was assessed on non-rewarded trials following training, with each stimulus tested individually and with both presented simultaneously as a stimulus compound. The weights of the Go/No-Go network were trained on experimental data giving the mean distribution of bar press rates across an 80 s period in which a tone stimulus signaled reward 5 s after stimulus onset and a flash stimulus signaled reward 30 s after onset. A separate Go/No-Go system was used for each stimulus, but the weighted output of each fed into a final recurrent integrator unit whose weights were unmodifiable. The RNN model was implemented in Matlab, and Matlab's machine learning tools were used to train the network on the data from non-rewarded trials.
The neural net output accurately fit the temporal distribution of the tone-initiated and flash-initiated bar press data. A "Temporal Averaging" effect was also obtained when the flash and tone stimuli were combined. These results indicate that the tone and flash responses were not simply superposed, as they would be in a linear system, but interacted non-linearly. To fit the empirical averaging data accurately, it was necessary to implement non-linear "saliency functions" that limited the output signal of each stimulus to the final integrator when the other stimulus was co-present. The model suggests that the central nervous system generates timing as a dynamical system whose timing properties are embedded in its connection weights. In this way, event timing is coded much as it is in other sensory-motor systems, such as the vestibulo-ocular and optokinetic systems, which combine sensory inputs from the vestibular and visual systems to generate the temporal aspects of compensatory eye movements.
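The architecture described above can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation. It assumes a first-order unit with one-step delayed self-feedback, a rectified difference of a fast Go unit and a slower No-Go unit as the pulse, and a divisive gate as a stand-in for the saliency functions, whose actual form the abstract does not give. All function names and weight values are hypothetical and unfitted.

```python
import math

def feedback_unit(inputs, w, g=1.0):
    """Unit with one-step delayed self-feedback: x[t] = w * x[t-1] + g * u[t]."""
    x, out = 0.0, []
    for u in inputs:
        x = w * x + g * u
        out.append(x)
    return out

def time_constant(w, dt=1.0):
    """Time constant implied by a feedback weight 0 < w < 1 (same units as dt)."""
    return -dt / math.log(w)

def go_nogo_pulse(step, w_go, w_nogo, a_go=1.0, a_nogo=1.0):
    """Assumed pulse: rectified weighted difference of Go and No-Go step responses."""
    go = feedback_unit(step, w_go, g=1.0 - w_go)        # gain chosen so the response settles at 1
    nogo = feedback_unit(step, w_nogo, g=1.0 - w_nogo)
    return [max(0.0, a_go * x - a_nogo * y) for x, y in zip(go, nogo)]

def saliency_gate(own, other, k=1.0):
    """Hypothetical divisive gate: attenuate a pulse while the other stimulus is active."""
    return [p / (1.0 + k * q) for p, q in zip(own, other)]

# Illustrative weights only; the paper's trained values are not given in the abstract.
step = [1.0] * 80                                  # stimulus on for 80 time steps
tone = go_nogo_pulse(step, w_go=0.5, w_nogo=0.9)   # fast units: early pulse
flash = go_nogo_pulse(step, w_go=0.9, w_nogo=0.97) # slow units: later pulse

# Final integrator with fixed weights, driven by one stimulus or by the gated compound.
tone_alone = feedback_unit(tone, w=0.6, g=0.4)
flash_alone = feedback_unit(flash, w=0.6, g=0.4)
compound_in = [a + b for a, b in zip(saliency_gate(tone, flash),
                                     saliency_gate(flash, tone))]
compound = feedback_unit(compound_in, w=0.6, g=0.4)
```

In this sketch a larger feedback weight gives a longer time constant, so the flash pulse peaks later than the tone pulse. The gate makes the compound input smaller than the plain sum of the two pulses, which is the kind of non-linear interaction the abstract describes. Whether it reproduces the reported temporal averaging depends on the fitted weights and on the real saliency functions.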
