An Optimized Learning-Based Directory Placement Policy with Two-Rounds Selection in Distributed File Systems
Yuanzhang Wang,Fengkui Yang,Ke Zhou,Chunhua Li,Chong Liu,Ji Zhang,Zhuo Cheng
DOI: https://doi.org/10.1016/j.future.2023.12.012
IF: 7.307
2023-01-01
Future Generation Computer Systems
Abstract: Load balancing is a critical problem in distributed file systems. Previous works focus on distributing data across nodes at the file level, often overlooking the potential benefits of exploiting directory locality and the long duration of directory hotness. This oversight may impair balance and degrade performance. To overcome these shortcomings, we propose OLDP, an optimized learning-based directory placement policy with two-rounds selection, which determines the data layout by predicting the load. Specifically, we establish a relationship between directory request features and state information to predict the state information of a directory (storage capacity, bandwidth, and IOPS). We then propose a two-rounds selection multidimensional resource allocation policy to place directories in hybrid storage. On the one hand, it weighs the trade-off between same-category directories and peer directories; on the other hand, it avoids overloading the nodes with fast devices. Extensive experiments demonstrate that OLDP not only efficiently alleviates load imbalance but also improves performance in practice. Specifically, in a hybrid storage system, OLDP improves service latency, IOPS, and bandwidth by 16%, 26%, and 25%, respectively, compared to the state-of-the-art method. In a practical all-flash storage system, OLDP reduces service latency by 36% and increases IOPS and bandwidth by 8% and 9%, respectively.
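The abstract gives no algorithmic detail, so the following is only an illustrative sketch of what a two-round, multidimensional placement decision could look like, not the OLDP algorithm itself. It assumes the predicted directory load is already available as a (capacity, bandwidth, IOPS) triple. The names `Load`, `Node`, `place_directory`, and the `fast_cap` threshold are all hypothetical. The trade-off between same-category and peer directories is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Load:
    """Three resource dimensions named in the abstract (units are arbitrary)."""
    capacity: float
    bandwidth: float
    iops: float


@dataclass
class Node:
    name: str
    limit: Load          # total resources of the node
    used: Load           # resources already consumed
    fast: bool = False   # True if the node is backed by fast devices


def utilization_after(node: Node, demand: Load) -> float:
    """Worst-dimension utilization if `demand` were placed on `node`."""
    return max(
        (node.used.capacity + demand.capacity) / node.limit.capacity,
        (node.used.bandwidth + demand.bandwidth) / node.limit.bandwidth,
        (node.used.iops + demand.iops) / node.limit.iops,
    )


def place_directory(nodes, demand, fast_cap=0.7):
    """Hypothetical two-round selection; returns the chosen node or None.

    Round 1 filters out nodes that would exceed their utilization cap.
    Fast nodes get a stricter cap (assumed value) so they are not overloaded.
    Round 2 picks the remaining node with the lowest resulting utilization.
    """
    # Round 1: feasibility filter across all three dimensions.
    candidates = []
    for node in nodes:
        u = utilization_after(node, demand)
        cap = fast_cap if node.fast else 1.0
        if u <= cap:
            candidates.append((u, node))
    if not candidates:
        return None
    # Round 2: choose the least-utilized feasible node.
    return min(candidates, key=lambda c: c[0])[1]
```

For example, with one fast node at 60% utilization and one slow node at 50%, a directory predicted to need 20% of each resource would be routed to the slow node, because the fast node would exceed the assumed 70% cap.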