LDPP: A Learned Directory Placement Policy in Distributed File Systems.
Yuanzhang Wang, Fengkui Yang, Ji Zhang, Ke Zhou, Chunhua Li, Chong Liu, Zhuo Cheng, Wei Fang, Jinhu Liu
DOI: https://doi.org/10.1145/3545008.3545057
2022-01-01
Abstract: Load balancing is a critical problem in distributed file systems (DFS). Previous works focus on distributing data evenly across nodes or storage devices at the file level, but fail to exploit the directory’s locality and the long duration of the directory’s hotness, which may reduce the degree of balance and degrade performance. To overcome this shortcoming, we propose a learning-based directory placement policy, called LDPP, which determines the data layout by predicting the load. We first establish a relationship between a directory’s request characteristics and its state information in order to predict that state information (storage capacity, bandwidth, and IOPS). Then, each new directory is placed on a node using a multi-dimensional criterion based on the Manhattan distance over the predicted state information. In addition, we consider the trade-off between directories of the same category, as classified by the load prediction module, and peer directories, and explore their influence on the balance. Extensive experiments demonstrate that LDPP not only alleviates load imbalance and increases resource utilization but also improves DFS performance in practice: it reduces service latency by up to 36% and increases IOPS and bandwidth by 8% and 9%, respectively.
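The Manhattan-distance placement step can be illustrated with a minimal sketch. The abstract does not give the actual algorithm, so everything here is an assumption made for illustration: the function names (`manhattan`, `imbalance`, `place_directory`), the normalization of each load dimension to a comparable scale, and the choice of "sum of each node's Manhattan distance to the cluster mean" as the imbalance measure. The sketch places a new directory on whichever node leaves the cluster least imbalanced given the directory's predicted (capacity, bandwidth, IOPS) load.

```python
from typing import Dict, Tuple

# Hypothetical load vector: (capacity, bandwidth, IOPS), each assumed to be
# normalized to a comparable scale (for example, a fraction of node maximum).
Load = Tuple[float, float, float]


def manhattan(a: Load, b: Load) -> float:
    """Manhattan (L1) distance between two load vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def imbalance(loads: Dict[str, Load]) -> float:
    """Assumed imbalance measure: total L1 distance of nodes from the mean load."""
    n = len(loads)
    mean = tuple(sum(load[d] for load in loads.values()) / n for d in range(3))
    return sum(manhattan(load, mean) for load in loads.values())


def place_directory(nodes: Dict[str, Load], predicted: Load) -> str:
    """Pick the node whose selection minimizes cluster imbalance after placement."""
    def cost(name: str) -> float:
        trial = dict(nodes)
        # Add the directory's predicted load to the candidate node only.
        trial[name] = tuple(a + b for a, b in zip(nodes[name], predicted))
        return imbalance(trial)

    # Sorting the names makes tie-breaking deterministic.
    return min(sorted(nodes), key=cost)


if __name__ == "__main__":
    cluster = {
        "n1": (0.8, 0.7, 0.9),
        "n2": (0.2, 0.3, 0.1),
        "n3": (0.5, 0.5, 0.5),
    }
    print(place_directory(cluster, (0.1, 0.1, 0.1)))
```

In this toy cluster the lightly loaded node `n2` is selected, because adding the predicted load there pulls the cluster closest to its mean in all three dimensions at once. A real policy would also need the learned load predictor and the same-category versus peer-directory trade-off described in the abstract, neither of which is modeled here.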