A deep reinforcement learning method for multi-stage equipment development planning in uncertain environments

Peng Liu, Boyuan Xia, Zhiwei Yang, Jichao Li, Yuejin Tan
DOI: https://doi.org/10.23919/JSEE.2022.000140
IF: 1.363
2022-01-01
Journal of Systems Engineering and Electronics
Abstract: Equipment development planning (EDP) is usually a long-term process, often performed in an environment with high uncertainty. Traditional multi-stage dynamic programming cannot cope with this kind of uncertainty, in which situations are unpredictable. To deal with this problem, a multi-stage EDP model based on a deep reinforcement learning (DRL) algorithm is proposed to respond quickly to any environmental change within a reasonable range. Firstly, the basic problem of multi-stage EDP is described, and a mathematical planning model is constructed. Then, for two kinds of uncertainty (future capability requirements and the amount of investment in each stage), a corresponding DRL framework is designed to define the environment, state, action, and reward function for multi-stage EDP. After that, the dueling deep Q-network (Dueling DQN) algorithm is used to solve the multi-stage EDP problem and generate an approximately optimal multi-stage equipment development scheme. Finally, a case of ten kinds of equipment in 100 randomly generated possible environments is used to test the feasibility and effectiveness of the proposed models. The results show that the algorithm can respond instantaneously in any state of the multi-stage EDP environment and, unlike traditional algorithms, does not need to re-optimize the problem whenever the environment changes. In addition, when the equipment capability requirements change, the algorithm can flexibly adjust the subsequent planning stages to adapt to the new requirements.
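For readers unfamiliar with Dueling DQN, the sketch below illustrates the step that gives the architecture its name: a state value V(s) and per-action advantages A(s, a) are combined into Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)). This is the commonly used mean-subtracted form; the abstract does not say which variant the authors use, and the function names and numbers here are hypothetical, chosen only for illustration.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-subtracted form.

    Assumption: this is the generic Dueling DQN aggregation, not necessarily
    the exact formulation used in the paper.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical numbers: three candidate development actions at one planning stage.
q = dueling_q_values(2.0, [0.5, -0.5, 0.0])
best = greedy_action(q)
```

In a full agent, V(s) and A(s, a) would be the outputs of two heads of a neural network fed with the planning state, and a trained network can pick the greedy action for any state without re-optimizing, which is consistent with the instantaneous response the abstract reports.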