Cost-Effective Data Placement in Edge Storage Systems with Erasure Code

Hai Jin, Ruikun Luo, Qiang He, Song Wu, Zilai Zeng, Xiaoyu Xia
DOI: https://doi.org/10.1109/tsc.2022.3152849
IF: 11.019
2023-01-01
IEEE Transactions on Services Computing
Abstract: Edge computing, as a new computing paradigm, brings the computing and storage capacities of cloud computing to the network edge to provide low-latency services for users. The networked edge servers in a specific area constitute edge storage systems (ESSs), where popular data can be stored to serve the users in the area. The novel ESSs raise many new opportunities as well as unprecedented challenges. Most existing studies of ESSs focus on the storage of data replicas in the system to ensure low data retrieval latency for users. However, replica-based edge storage strategies can easily incur high storage costs. It is not cost-effective to store massive replicas of large-size data, especially data that do not require real-time access at the edge, e.g., system upgrade files, popular app installation files, and videos in online games. It may not even be possible due to the constrained storage resources on edge servers. In this article, we make the first attempt to investigate the use of erasure codes in cost-effective data storage at the edge. The focus is to find the optimal strategy for placing coded data blocks on the edge servers in an ESS, aiming to minimize the storage cost while serving all the users in the system. We first model this novel Erasure Coding based Edge Data Placement (EC-EDP) problem as an integer linear programming problem and prove its $\mathcal{NP}$-hardness. Then, we propose an optimal approach named EC-EDP-O based on integer programming. An approximation algorithm named EC-EDP-V is also proposed to efficiently address the high computational complexity of large-scale EC-EDP scenarios. The extensive experimental results demonstrate that EC-EDP-O and EC-EDP-V can save an average of 68.58% (and up to 81.16% in large-scale scenarios) of the storage cost compared with replica-based storage approaches.
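To illustrate why erasure codes can be cheaper than replicas, the following is a minimal sketch of the standard storage-overhead arithmetic. It assumes a generic (k, m) erasure code in which the data is split into k data blocks plus m parity blocks of equal size and any k blocks suffice to rebuild it. The function names, the 600 MB file, the 3 replicas, and the (4, 2) code are all illustrative assumptions; they are not the paper's cost model, parameters, or results.

```python
def replication_cost(size_mb: float, replicas: int) -> float:
    """Total storage used when `replicas` full copies of the data are kept."""
    return size_mb * replicas


def erasure_coding_cost(size_mb: float, k: int, m: int) -> float:
    """Total storage used by a (k, m) erasure code.

    Assumption: the data is split into k blocks of size size_mb / k and
    m parity blocks of the same size are added, so k + m blocks are stored.
    """
    return size_mb * (k + m) / k


def relative_saving(size_mb: float, replicas: int, k: int, m: int) -> float:
    """Fraction of storage saved by erasure coding relative to replication."""
    rep = replication_cost(size_mb, replicas)
    ec = erasure_coding_cost(size_mb, k, m)
    return 1 - ec / rep


if __name__ == "__main__":
    # Hypothetical example: a 600 MB installation file,
    # 3 replicas versus a (4, 2) erasure code.
    print(replication_cost(600, 3))        # full copies
    print(erasure_coding_cost(600, 4, 2))  # 6 coded blocks of 150 MB each
    print(relative_saving(600, 3, 4, 2))
```

This sketch only compares total bytes stored. The placement question studied in the paper, namely which edge servers should hold which coded blocks so that all users can still be served, is a separate optimization that this example does not attempt.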