Optimal Backup in Heterogeneous Standby Systems Exposed to Shocks

Gregory Levitin, Maxim Finkelstein
DOI: https://doi.org/10.1016/j.ress.2017.04.022
IF: 7.247
2017-01-01
Reliability Engineering & System Safety
Abstract: The paper considers non-repairable 1-out-of-N heterogeneous warm standby computing systems whose components are exposed to internal failures and external shocks. To allow data recovery when the operating component fails, backup procedures are performed during the computational mission. The backups enable an activated standby component to take over the mission task from the point where the last backup was completed, without redoing the entire task from scratch. Both data backup and retrieval times depend on the amount of work performed. The system components differ in performance level, replacement time, time-to-internal-failure distribution, and shock survival probability. The shock processes also have different characteristics for different components. A numerical method is proposed to evaluate the mission success probability for a given allowed mission time, as well as the expected mission completion time. The optimal backup scheduling problem is then formulated and solved for different optimization objectives and constraints.
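To make the backup mechanism concrete, the following is a minimal Monte Carlo sketch of a heavily simplified version of the model. It is not the numerical method proposed in the paper. The assumptions are illustrative only: each component has a constant hazard rate while operating (internal failures and shocks lumped together), standby components do not fail while waiting, no failures occur during replacement or data retrieval, and backup and retrieval times are proportional to the amount of work saved. All names (simulate_mission, estimate_success, backup_cost, retrieve_cost) and all numbers are hypothetical.

```python
import random


def simulate_mission(components, total_work, backup_points, allowed_time,
                     backup_cost=0.01, retrieve_cost=0.01, rng=random):
    """Simulate one mission; return True if it finishes within allowed_time.

    components: list of dicts with keys
        'perf'    - work done per unit time,
        'rate'    - constant hazard while operating (assumption),
        'replace' - delay before the component starts working.
    backup_points: work levels at which a backup is taken.
    """
    # Work levels where the component pauses: every backup, then completion.
    checkpoints = sorted(p for p in backup_points if 0 < p < total_work)
    checkpoints.append(total_work)

    clock = 0.0   # elapsed mission time
    saved = 0.0   # work preserved by the last completed backup

    for comp in components:
        # Activate the component and retrieve the last backup.
        clock += comp["replace"] + retrieve_cost * saved
        life = rng.expovariate(comp["rate"]) if comp["rate"] > 0 else float("inf")
        used = 0.0      # operating time consumed by this component
        done = saved    # work resumes from the last completed backup
        failed = False

        for target in checkpoints:
            if target <= done:
                continue
            segment = (target - done) / comp["perf"]
            if target < total_work:
                segment += backup_cost * target   # backup time grows with work
            if used + segment > life:
                # Failure before the backup completes: progress since the
                # previous backup is lost.
                clock += life - used
                failed = True
                break
            used += segment
            clock += segment
            done = target
            if target < total_work:
                saved = target

        if clock > allowed_time:
            return False
        if not failed:
            return True
    return False   # all components failed before the task was completed


def estimate_success(components, total_work, backup_points, allowed_time,
                     runs=10000, seed=0):
    """Estimate mission success probability by repeated simulation."""
    rng = random.Random(seed)
    wins = sum(
        simulate_mission(components, total_work, backup_points,
                         allowed_time, rng=rng)
        for _ in range(runs)
    )
    return wins / runs


if __name__ == "__main__":
    # Hypothetical two-component system: a fast unit backed by a slower one.
    system = [
        {"perf": 2.0, "rate": 0.05, "replace": 0.0},
        {"perf": 1.0, "rate": 0.02, "replace": 0.5},
    ]
    for schedule in ([], [10.0], [5.0, 10.0, 15.0]):
        print(schedule, estimate_success(system, 20.0, schedule, 25.0))
```

Comparing schedules in this way shows the trade-off behind the scheduling problem: more frequent backups reduce the work lost on a failure but add backup time to the mission.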