Design and Implementation of a Low-Overhead File Checkpointing Approach

Dan Pei, Dongsheng Wang, Meiming Shen, Weimin Zheng
DOI: https://doi.org/10.1109/HPC.2000.846594
2000-01-01
Abstract: File checkpointing, i.e., saving and restoring the state of a process's user files, is an important capability of checkpointing and recovery techniques. This paper describes the design and implementation of a file checkpointing approach called Modification Operation Buffering. The approach buffers all modification operations issued after one checkpoint until the next, so that the operations between two checkpoints are atomic as a whole. By dynamically choosing a suitable size for the memory buffer and by hiding the latency of flushing the buffer, the approach achieved lower overhead than other approaches.
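The buffering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BufferedFile` and its methods `write_at`, `read_all`, `checkpoint`, and `rollback` are hypothetical names chosen for illustration. The sketch assumes a single file, keeps the buffer entirely in memory at a fixed (unbounded) size, and flushes synchronously, so it shows neither the dynamic buffer sizing nor the latency hiding that the abstract mentions, and it does not protect against a crash in the middle of the flush.

```python
import os


class BufferedFile:
    """Hypothetical wrapper that defers file modifications until a checkpoint."""

    def __init__(self, path):
        self.path = path
        self.pending = []  # buffered (offset, data) writes since the last checkpoint
        # Create the file if it does not exist, without truncating it.
        open(path, "ab").close()

    def write_at(self, offset, data):
        # Record the modification in memory; the file on disk is not touched.
        self.pending.append((offset, bytes(data)))

    def read_all(self):
        # Return the file contents as the process sees them:
        # the on-disk state with the buffered writes replayed on top.
        with open(self.path, "rb") as f:
            content = bytearray(f.read())
        for offset, data in self.pending:
            end = offset + len(data)
            if len(content) < end:
                content.extend(b"\0" * (end - len(content)))
            content[offset:end] = data
        return bytes(content)

    def checkpoint(self):
        # Apply every buffered modification to the file, then empty the buffer.
        with open(self.path, "r+b") as f:
            for offset, data in self.pending:
                f.seek(offset)
                f.write(data)
            f.flush()
            os.fsync(f.fileno())
        self.pending.clear()

    def rollback(self):
        # Recovery to the last checkpoint: the file on disk was never modified,
        # so discarding the buffer is enough.
        self.pending.clear()
```

In this sketch, rolling back costs nothing beyond clearing a list, because the file on disk always reflects the most recent checkpoint. The price is paid at checkpoint time, when the buffered operations are flushed, which is presumably why the paper focuses on buffer sizing and on hiding the flush latency.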