An efficient forward recovery checkpointing scheme in dissimilar redundancy computer system

GuoDong Wang, Zhengjun Zhai, Tao Huang, Kaichen Huang
DOI: https://doi.org/10.1109/CISE.2009.5366252
2009-01-01
Abstract: Roll-Forward Checkpointing Schemes (RFCS) [1,2,3,4] were developed to avoid rollback in the presence of independent faults and to increase the probability that a task completes within a tight deadline. However, the independence assumption underlying RFCS does not hold most of the time: running the same software on the same hardware may result in correlated faults. A further problem is that these schemes may lose useful built-in self-detection information, which degrades performance. In this paper, we propose a Twice Dissimilar Redundancy Computer based Roll-Forward Recovery scheme (TDCS) that avoids correlated faults and achieves fault tolerance without any extra process. Finally, we use a novel technique based on a Markov Reward Model [5] to show that TDCS achieves a considerably better average completion time than RFCS when built-in self-detection coverage is high. ©2009 IEEE.