High Effective Two-round Remote File Fast Synchronization Algorithm

XU Dan, SHENG Yonghong, JU Dapeng, WU Jianping, WANG Dongsheng
DOI: https://doi.org/10.3778/j.issn.1673-9418.2011.01.004
2011-01-01
Abstract: Fast remote file synchronization is widely used in scenarios such as file backup and recovery, Web and FTP site mirroring, content distribution networks, and Web access. This paper presents tpsync, a highly effective two-round fast synchronization algorithm that combines content-based variable-sized chunking with a fixed-sized sliding block method. In the first round, tpsync uses content-based variable-sized chunks to locate, at a coarse granularity, the regions that differ between two similar files. In the second round, it searches only those changed segments for the differing data, at a fine granularity, using the fixed-sized sliding block method. File synchronization is thus completed in two rounds of data exchange. The paper compares tpsync with rsync, the traditional single-round synchronization method. Extensive experiments on text, binary, and database files show that tpsync outperforms rsync on average in both synchronization time and the amount of network traffic.
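The two rounds described in the abstract can be illustrated with a minimal local sketch. This is not the authors' implementation: the abstract does not specify hash functions, chunking parameters, or the wire protocol, so the use of Adler-32 and SHA-1, the window, mask, minimum chunk size, and block size, and all function names (`cdc_chunks`, `two_round_delta`, `apply_delta`) are assumptions chosen for illustration. Both files are held in memory instead of being exchanged over a network.

```python
import hashlib
import zlib


def digest(b):
    # Strong checksum (assumption: SHA-1; the abstract does not name one).
    return hashlib.sha1(b).digest()


def cdc_chunks(data, window=16, mask=0x3F, min_size=32):
    """Content-defined chunking: cut where a hash of the trailing window
    matches a bit pattern. Returns a list of (offset, length)."""
    chunks, start = [], 0
    for i in range(len(data)):
        end = i + 1
        if end - start < min_size:
            continue
        # Recomputed per position for clarity; a real tool would roll the hash.
        if zlib.adler32(data[end - window:end]) & mask == 0:
            chunks.append((start, end - start))
            start = end
    if start < len(data):
        chunks.append((start, len(data) - start))
    return chunks


def two_round_delta(old, new, block=8):
    """Build a delta that turns `old` into `new` in two conceptual rounds."""
    old_chunks, new_chunks = cdc_chunks(old), cdc_chunks(new)
    # Round 1 (coarse): compare variable-sized chunks by digest.
    old_by_digest = {digest(old[o:o + n]): o for o, n in old_chunks}
    new_digests = {digest(new[o:o + n]) for o, n in new_chunks}

    # Round 2 (fine): index fixed-size blocks of the unmatched old chunks only.
    index = {}
    for o, n in old_chunks:
        if digest(old[o:o + n]) in new_digests:
            continue
        for b in range(o, o + n - block + 1, block):
            blk = old[b:b + block]
            index.setdefault(zlib.adler32(blk), (digest(blk), b))

    delta = []
    for o, n in new_chunks:
        src = old_by_digest.get(digest(new[o:o + n]))
        if src is not None:
            delta.append(("copy", src, n))  # whole chunk unchanged
            continue
        # Slide a fixed-size window over the changed chunk only.
        seg, i, lit = new[o:o + n], 0, bytearray()
        while i < len(seg):
            win = seg[i:i + block]
            hit = index.get(zlib.adler32(win)) if len(win) == block else None
            if hit is not None and hit[0] == digest(win):
                if lit:
                    delta.append(("data", bytes(lit)))
                    lit = bytearray()
                delta.append(("copy", hit[1], block))
                i += block
            else:
                lit.append(seg[i])
                i += 1
        if lit:
            delta.append(("data", bytes(lit)))
    return delta


def apply_delta(old, delta):
    """Rebuild the new file from the old file plus the delta."""
    out = bytearray()
    for op in delta:
        out += old[op[1]:op[1] + op[2]] if op[0] == "copy" else op[1]
    return bytes(out)
```

In this sketch, `("copy", offset, length)` entries refer to bytes the receiver already holds, so only the `("data", ...)` entries would need to cross the network. Restricting the fixed-size block index to chunks that failed the first round is what keeps the second round's signature exchange small.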