Train Trajectory Optimization with High-Risk State Space Boundaries: A Safe Reinforcement Learning Approach

Yalan Chen, Jing Xun, Yafei Liu, Ronghui Liu, Shibo He, Xin Wan, Zicong Zhao
DOI: https://doi.org/10.2139/ssrn.4282959
2022-01-01
Abstract: Online train trajectory optimization is essential for automatic train operation (ATO) on railways, where operational environments and constraints vary. Reinforcement learning (RL) partly enables such autonomous learning through objectives and interaction with the environment, but it still falls short in perceiving risk across operational conditions. This paper proposes a safe RL-based train trajectory optimization approach that addresses this shortcoming through high-risk perception and explicit boundaries. To deal promptly with potentially unsafe states and actions during train operation, a high-risk state space (HRSS) is designed and its boundaries are analyzed. This approach avoids setting artificial penalty hyperparameters for violations of safety constraints. With the HRSS, the safe RL algorithm generates safe actions that modify the exploration process, enabling the ATO to learn autonomously while avoiding high-risk states during online optimization. Simulation-based experiments verify that the proposed method generates optimal train trajectories with guaranteed safety margins within 2 s and improves efficiency compared with a common RL algorithm. Under different online scheduling conditions, the proposed method still guarantees safety and effectiveness.
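The idea of replacing an unsafe exploratory action with a safe one, rather than penalizing it through a tuned reward term, can be illustrated with a minimal sketch. This is not the authors' HRSS construction, which the abstract does not specify. It assumes a point-mass train, a constant speed limit, a single stopping target, and a high-risk boundary given by the full-braking curve; the constants and the function names `boundary_speed`, `step`, `is_high_risk`, and `safe_action` are all hypothetical.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
B_MAX = 1.0  # maximum braking deceleration, m/s^2
A_MAX = 1.0  # maximum traction acceleration, m/s^2
DT = 1.0     # control step, s


def boundary_speed(s, s_target, v_limit):
    """Highest speed at position s from which full braking still stops by s_target."""
    remaining = max(s_target - s, 0.0)
    return min(v_limit, math.sqrt(2.0 * B_MAX * remaining))


def step(s, v, a):
    """Advance a point-mass train by one control step under constant acceleration a."""
    if a < 0.0 and v + a * DT < 0.0:
        # The train comes to rest inside the step: use the exact stopping distance.
        return s + v * v / (2.0 * -a), 0.0
    v_next = v + a * DT
    return s + 0.5 * (v + v_next) * DT, v_next


def is_high_risk(s, v, s_target, v_limit):
    """A state is treated as high-risk when it lies above the braking boundary."""
    return v > boundary_speed(s, s_target, v_limit)


def safe_action(s, v, a_proposed, s_target, v_limit):
    """Return the proposed action if its successor state is safe, else full braking."""
    a = max(-B_MAX, min(A_MAX, a_proposed))  # clamp to the actuator range
    s_next, v_next = step(s, v, a)
    if is_high_risk(s_next, v_next, s_target, v_limit):
        return -B_MAX  # most conservative replacement action
    return a
```

An RL agent would call `safe_action` on every exploratory action before applying it, so unsafe proposals never reach the environment and no penalty weight has to be chosen. Full braking is used here only because it is the simplest replacement that keeps a state below the braking curve; a less conservative choice would yield smoother trajectories.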