Data Analysis and Synchronization on Inter-Continent Data Placement Laboratory

Kun Qian, Zhongzhi Luan, Zuowei Zhang, Hailong Yang, Yaqi Yang, Depei Qian
DOI: https://doi.org/10.1109/ccbd.2015.49
2015-01-01
Abstract: Collaborations among different organizations produce massive amounts of data, which are increasingly stored in distributed file systems such as HDFS. Although such systems are designed to manage many kinds of data, efficiently processing large volumes of data from local and remote nodes remains challenging, especially in handling data redundancy and the resources consumed when transferring data across organizations. This paper presents technologies for processing data across different physical locations, covering data analysis and data synchronization among nodes on different continents. We explain strategies for improving the efficiency of data retrieval during analysis and for reducing the resource utilization of data transfer.