Optimal Attempt Scheduling and Aborting in Heterogenous System Performing Asynchronous Multi-Attempt Mission

Gregory Levitin, Liudong Xing, Yuanshun Dai
DOI: https://doi.org/10.1016/j.ress.2024.110335
IF: 7.247
2024-01-01
Reliability Engineering & System Safety
Abstract: Multi-attempt mission aborting systems have recently received significant attention from the reliability community. Existing models mostly assume parallel or sequential execution of multiple attempts, which incurs high cost or low mission success probability (MSP), respectively. This paper advances the state of the art by considering a new model in which system components may be activated with certain delays, so that the next component can be activated before the previous one leaves operation, balancing the expected cost of lost components (ECC) against the MSP. Each component may abort its attempt according to an individual aborting policy defined by two parameters (the number of survived shocks and an operation time threshold) or upon receiving a common abort command. Because components may have different shock resistances and performance rates, their activation order can affect both MSP and ECC. Thus, we formulate and solve the optimal attempt scheduling and aborting policy (SAP) problem, which determines the vector of component activation times and the individual attempt aborting policy for each component to minimize the expected mission losses (EML). The EML, a function of MSP and ECC, is evaluated using a new numerical procedure. A detailed case study of a cloud data processing system is provided to demonstrate the proposed model.