Abstract: Deep reinforcement learning (DRL) has demonstrated significant potential in industrial manufacturing domains such as workshop scheduling and energy system management. However, because DRL models are inherently uncertain, they require rigorous validation before being applied to real-world tasks. Targeted tests may reveal performance deficiencies in pre-trained DRL models, yet the "black-box" nature of DRL makes model behavior difficult to test. We propose a novel performance improvement framework based on probabilistic automata that proactively identifies and corrects critical vulnerabilities of DRL systems, so that the performance of DRL models in real tasks can be improved with minimal model modifications. First, a probabilistic automaton is constructed from the historical trajectories of the DRL system by abstracting states into probabilistic decision-making units (PDMUs), and a reverse breadth-first search (BFS) is used to identify the key PDMU-action pairs that have the greatest impact on adverse outcomes. This process relies only on the state-action sequence and final result of each trajectory. Then, for each key PDMU, we search for the new action that has the greatest impact on favorable outcomes. Finally, the key PDMU, the undesirable action, and the new action are encapsulated as monitors that guide the DRL system toward more favorable outcomes through real-time monitoring and correction. Evaluations in two standard reinforcement learning environments and three real-world job scheduling scenarios confirm the effectiveness of the method, providing a degree of assurance for the deployment of DRL models in real-world applications.
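The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the binning-based `abstract` function, the helper names (`build_automaton`, `key_pairs`, `best_replacement`, `make_monitor`), the depth limit, and above all the use of a plain empirical failure rate as the "impact" score, which stands in for whatever measure the paper actually defines.

```python
from collections import defaultdict, deque

def abstract(state, width=1.0):
    """Hypothetical state abstraction: bin each feature into cells of size `width`."""
    return tuple(int(x // width) for x in state)

def build_automaton(trajectories):
    """trajectories: list of (steps, failed), with steps = [(state, action), ...].

    Returns visit and failure counts per (PDMU, action) pair, plus reverse
    edges mapping each node to the pairs that lead into it.
    """
    visits, failures = defaultdict(int), defaultdict(int)
    reverse_edges = defaultdict(set)
    for steps, failed in trajectories:
        pairs = [(abstract(s), a) for s, a in steps]
        for i, pair in enumerate(pairs):
            if i + 1 < len(pairs):
                successor = pairs[i + 1][0]
            else:
                successor = "FAIL" if failed else "OK"  # terminal outcome nodes
            reverse_edges[successor].add(pair)
        for pair in set(pairs):  # count each pair once per trajectory
            visits[pair] += 1
            failures[pair] += int(failed)
    return visits, failures, reverse_edges

def key_pairs(visits, failures, reverse_edges, max_depth=3):
    """Reverse BFS from the FAIL node, then rank reached pairs by failure rate.

    The failure-rate ranking is a simplification chosen for this sketch.
    """
    reached, seen_nodes = set(), {"FAIL"}
    queue = deque([("FAIL", 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for pdmu, action in reverse_edges.get(node, ()):
            reached.add((pdmu, action))
            if pdmu not in seen_nodes:
                seen_nodes.add(pdmu)
                queue.append((pdmu, depth + 1))
    return sorted(reached, key=lambda p: failures[p] / visits[p], reverse=True)

def best_replacement(pdmu, visits, failures):
    """Pick the action observed under `pdmu` with the lowest failure rate."""
    actions = [a for (p, a) in visits if p == pdmu]
    return min(actions, key=lambda a: failures[(pdmu, a)] / visits[(pdmu, a)])

def make_monitor(key_pdmu, bad_action, new_action):
    """Wrap the correction as a runtime monitor over (state, proposed action)."""
    def monitor(state, proposed_action):
        if abstract(state) == key_pdmu and proposed_action == bad_action:
            return new_action
        return proposed_action
    return monitor

# Toy usage with made-up trajectories: action "b" in the cell (1,) always fails.
trajectories = [
    ([((0.2,), "a"), ((1.5,), "b")], True),
    ([((0.3,), "a"), ((1.1,), "b")], True),
    ([((0.4,), "a"), ((1.7,), "c")], False),
    ([((0.1,), "a"), ((1.2,), "c")], False),
]
visits, failures, reverse_edges = build_automaton(trajectories)
ranked = key_pairs(visits, failures, reverse_edges)
key_pdmu, bad_action = ranked[0]
new_action = best_replacement(key_pdmu, visits, failures)
monitor = make_monitor(key_pdmu, bad_action, new_action)
```

The monitor leaves the policy itself untouched and only overrides the proposed action in the flagged PDMU, which mirrors the abstract's aim of improving performance with minimal model modifications.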

Towards Solving Industrial Sequential Decision-making Tasks under Near-predictable Dynamics via Reinforcement Learning: an Implicit Corrective Value Estimation Approach

Human operator decision support for highly transient industrial processes: a reinforcement learning approach

Adaptive Disassembly Sequence Planning for VR Maintenance Training Via Deep Reinforcement Learning

Probabilistic Automata-Based Method for Enhancing Performance of Deep Reinforcement Learning Systems

Towards Variance Reduction for Reinforcement Learning of Industrial Decision-making Tasks: A Bi-Critic Based Demand-Constraint Decoupling Approach

Hierarchical Reinforcement Learning for Multi-Objective Real-Time Flexible Scheduling in a Smart Shop Floor

Solving Inventory Management Problems Through Deep Reinforcement Learning

Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

An immediate-return reinforcement learning for the atypical Markov decision processes

Deep reinforcement learning applied to an assembly sequence planning problem with user preferences

Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival

Deep reinforcement learning driven inspection and maintenance planning under incomplete information and constraints

Can Deep Reinforcement Learning Improve Inventory Management? Performance on Dual Sourcing, Lost Sales and Multi-Echelon Problems

Controlling Estimation Error in Reinforcement Learning via Reinforced Operation

Distributional Reinforcement Learning for Scheduling of Chemical Production Processes

Deep reinforcement learning for dynamic distributed job shop scheduling problem with transfers

Bridging the gap between Markowitz planning and deep reinforcement learning

Reinforcement Learning Based Decision Making of Operational Indices in Process Industry Under Changing Environment

Reinforcement Learning Approaches for the Orienteering Problem with Stochastic and Dynamic Release Dates

Dynamic Measurement Scheduling for Event Forecasting using Deep RL

Deep Reinforcement Learning for Solving Management Problems: Towards A Large Management Mode