Abstract: Deep Reinforcement Learning (DRL) has shown outstanding performance in inducing effective action policies that maximize expected long-term return on many complex tasks. Much DRL work has focused on sequences of events with discrete time steps and ignores the irregular time intervals between consecutive events. In many real-world domains, however, data often consist of temporal sequences with irregular time intervals, and it is important to consider the time intervals between temporal events to capture latent progressive patterns of states. In this work, we present a general time-aware RL framework, Time-aware Q-Networks (TQN), which takes physical time intervals into account within a deep RL framework. TQN deals with time irregularity from two aspects: 1) the elapsed time in the past and an expected next observation time, for time-aware state approximation, and 2) the action time window for the future, for time-aware discounting of rewards. Experimental results show that by capturing the underlying structures in sequences with time irregularities from both aspects, TQNs significantly outperform DQN in four types of contexts with irregular time intervals. More specifically, our results show that in classic RL tasks such as CartPole and MountainCar and in the Atari benchmark with randomly segmented time intervals, time-aware discounting alone is more important, while in real-world tasks such as nuclear reactor operation and septic patient treatment with intrinsic time intervals, both time-aware state and time-aware discounting are crucial. Moreover, to improve the agent's learning capacity, we explored three boosting methods: Double networks, Dueling networks, and Prioritized Experience Replay. Our results show that for the two real-world tasks, combining all three boosting methods with TQN is especially effective.
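To make the idea of time-aware discounting concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the exact form of the discount, so the sketch assumes one common choice: the per-step discount factor is raised to the power of the elapsed physical time divided by a reference time unit. The function name `time_aware_td_target` and the parameters `dt` and `tau` are hypothetical names introduced only for illustration.

```python
def time_aware_td_target(reward, next_q_values, dt, gamma=0.99, tau=1.0, done=False):
    """One-step Q-learning target with a discount that depends on elapsed time.

    ASSUMPTION (not taken from the abstract): the effective discount is
    gamma ** (dt / tau), where dt is the physical time until the next
    observation and tau is a reference time unit.
    """
    if dt < 0:
        raise ValueError("dt must be non-negative")
    # A longer gap between events shrinks the weight of the future value.
    discount = gamma ** (dt / tau)
    # Terminal transitions have no future value to bootstrap from.
    bootstrap = 0.0 if done else max(next_q_values)
    return reward + discount * bootstrap


if __name__ == "__main__":
    # Same transition, observed after one and after two time units.
    print(time_aware_td_target(1.0, [0.0, 2.0], dt=1.0, gamma=0.5))
    print(time_aware_td_target(1.0, [0.0, 2.0], dt=2.0, gamma=0.5))
```

With `dt` fixed at `tau`, this reduces to the standard DQN target with a constant discount, so under this assumption regular time steps are a special case of the time-aware form.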

Time‐in‐action RL

Time-Aware Q-Networks: Resolving Temporal Irregularity for Deep Reinforcement Learning

Act Better by Timing: A timing-Aware Reinforcement Learning for Autonomous Driving

When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL

Generalized Reinforcement Learning: Experience Particles, Action Operator, Reinforcement Field, Memory Association, and Decision Concepts

Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control

Simplified Temporal Consistency Reinforcement Learning

Actor-Critic with variable time discretization via sustained actions

Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off

Deep Reinforcement Learning With Macro-Actions

Reconstructing Actions To Explain Deep Reinforcement Learning

Safe reinforcement learning under temporal logic with reward design and quantum action selection

Temporal Shift Reinforcement Learning

Interval timing in deep reinforcement learning agents

Human-in-the-Loop Reinforcement Learning in Continuous-Action Space

Human-Level Reinforcement Learning through Theory-Based Modeling, Exploration, and Planning

MAN: Multi-Action Networks Learning

Inferring Time-Varying Internal Models of Agents Through Dynamic Structure Learning

Reinforcement learning under temporal logic constraints as a sequence modelling problem

Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning