Joint Optimal Checkpointing and Rejuvenation Policy for Real-Time Computing Tasks

Gregory Levitin, Liudong Xing, Liang Luo
DOI: https://doi.org/10.1016/j.ress.2018.10.006
2019-01-01
Abstract: The performance of a software system can deteriorate from higher to lower levels due to software aging. To counteract the aging effect, software rejuvenation is widely used to restore the performance of a degraded system before a crash actually takes place. To restore system function effectively after each rejuvenation action, it is desirable to apply checkpointing, which occasionally saves the system state to reliable storage so that the mission task can be resumed from the last saved checkpoint instead of being restarted from the very beginning. Because both rejuvenation and checkpointing incur system overhead while bringing these benefits, it is important to determine the rejuvenation and checkpointing scheduling policy that optimizes the system performance measures of interest. This paper makes new contributions by modeling and optimizing the joint maintenance policy involving state-based rejuvenation and a periodic checkpointing schedule for software systems performing real-time computing tasks. The system can pass through multiple performance degradation levels, or states, and the transition times between states can follow arbitrary distributions. The proposed solution methodology includes an efficient numerical algorithm for evaluating the probability of task completion (PTC) by a pre-specified deadline. The joint optimal rejuvenation and checkpointing policy is then determined to maximize the PTC of the considered real-time task. Examples are provided to illustrate applications of the proposed methodology as well as the effects of system parameters on the optimization solution.