A Checkpoint-based Rollback Recovery and Process Migration System

Dong-sheng WANG, Mei-ming SHEN, Wei-min ZHENG, Dan PEI
1999-01-01
Ruan Jian Xue Bao/Journal of Software
Abstract: To provide hardware and software fault tolerance and load balancing for cluster computers, a checkpoint-based rollback recovery and process migration system (ChaRM) is presented. Faults in a network of workstations (NOW) can be recovered from through checkpointing and rollback recovery. The basic design idea and architecture of the ChaRM system are described. The checkpointing and rollback recovery (CRR) technique and the process migration mechanism are discussed. Performance evaluation results for some typical scientific computations are given.