SQL-DFS: A Massive Small File Storage System Based on HDFS
MA Zhiqiang, YANG Shuangtao, YAN Rui, ZHANG Zeguang
DOI: https://doi.org/10.11936/bjutxb2015060040
2016-01-01
Abstract: Storing massive numbers of small files in the Hadoop Distributed File System (HDFS) leads to a high NameNode memory occupancy rate. To address this problem, this paper analyzes the HDFS storage structure and presents SQL-DFS, a file system based on a metadata storage cluster. SQL-DFS adds a small-file processing module to the NameNode, which moves small-file metadata out of NameNode memory and into the metadata storage cluster. A relational database cluster is used to improve the speed of metadata reads and writes, and the small-file read process is optimized to reduce the time spent on requests to the NameNode. To further reduce the load on the NameNode, the checking of file blocks from DataNodes is performed by the metadata storage cluster. Finally, comparative experiments were carried out between HDFS and an SQL-DFS experimental platform. The results show that SQL-DFS is significantly better than the original HDFS architecture in both file average cost (FAC) and memory occupancy rate, and that it has better small-file storage capacity. SQL-DFS can therefore be used to store massive numbers of small files.
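The idea of keeping small-file metadata in a relational store rather than in NameNode memory can be illustrated with a minimal sketch. This is not the authors' implementation: the table name `small_file_meta`, its columns, the helper functions, and the use of an in-memory SQLite database as a stand-in for the relational database cluster are all assumptions made for illustration.

```python
import sqlite3


def open_metadata_store():
    """Open a stand-in metadata store (hypothetical schema, in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE small_file_meta (
               path     TEXT PRIMARY KEY,   -- full path of the small file
               block_id INTEGER NOT NULL,   -- block that holds the file data
               datanode TEXT NOT NULL,      -- DataNode that stores the block
               length   INTEGER NOT NULL    -- file length in bytes
           )"""
    )
    return conn


def put_small_file(conn, path, block_id, datanode, length):
    """Record metadata for a small file in the store, not in NameNode memory."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO small_file_meta VALUES (?, ?, ?, ?)",
            (path, block_id, datanode, length),
        )


def locate_small_file(conn, path):
    """Return (block_id, datanode, length) for a path, or None if unknown."""
    return conn.execute(
        "SELECT block_id, datanode, length FROM small_file_meta WHERE path = ?",
        (path,),
    ).fetchone()


store = open_metadata_store()
put_small_file(store, "/logs/a.txt", 1001, "datanode-1:50010", 2048)
print(locate_small_file(store, "/logs/a.txt"))  # (1001, 'datanode-1:50010', 2048)
```

In this sketch a read needs a single indexed lookup by path to find the block location, which is the kind of request the abstract describes offloading from the NameNode.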