Global Virtual Data Space for Unified Data Access Across Supercomputing Centers

Bing Wei, Limin Xiao, Hanjie Zhou, Guangjun Qin, Yao Song, Chenhao Zhang
DOI: https://doi.org/10.1109/tcc.2022.3164251
IF: 5.697
2022-01-01
IEEE Transactions on Cloud Computing
Abstract: In the wide-area high-performance computing environment, heterogeneous storage resources are geographically distributed across different supercomputing centers, which creates barriers between applications and data. This article proposes a global virtual data space, named GVDS, to meet the need for unified data access across supercomputing centers. GVDS integrates the parallel/distributed file systems of supercomputing centers to present users with a virtual space of tremendous storage capacity. GVDS organizes users into groups for easy management, which allows users to share, collaborate, and perform computations on the stored data. For failure tolerance, global metadata is replicated and distributed across multiple supercomputing centers, and redundant I/O service components are deployed in each supercomputing center. GVDS uses adaptive prefetching, caching, and request merging to improve access performance. Experimental results on real-world supercomputing centers show that GVDS delivers excellent I/O performance on micro-benchmarks, real-world traces, and applications.