Storage-Optimization Method for Massive Small Files of Agricultural Resources Based on Hadoop

Jun Liu
DOI: https://doi.org/10.20965/jaciii.2019.p0634
2019-07-20
Journal of Advanced Computational Intelligence and Intelligent Informatics
Abstract: Hadoop is designed for storing and processing big data, and large datasets in particular. In practice, however, many workloads consist of numerous small files, which Hadoop handles poorly. This paper proposes a Hadoop-based storage-optimization method for massive numbers of small agricultural resource files. The method merges small files according to the predecessor and successor relationships between them. It improves read efficiency by accessing small files through an index mechanism, caching metadata, and prefetching associated small files. Experimental results show that the method reduces memory consumption on the Hadoop name node and improves system performance.
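The abstract does not give implementation details, so the following is only a minimal in-memory sketch of the three ideas it names: merging small files, looking them up through an index, and prefetching the associated (successor) file into a cache. It does not use Hadoop or HDFS. The class `MergedStore`, its methods, the LRU cache policy, and the sample file names are all hypothetical, and the predecessor/successor relationship is assumed to be simply the order in which files are merged.

```python
from collections import OrderedDict


class MergedStore:
    """Toy in-memory stand-in for one merged file plus its small-file index.

    Hypothetical illustration only; not the paper's implementation.
    """

    def __init__(self, cache_size=2):
        self.blob = bytearray()      # merged contents of all small files
        self.index = {}              # name -> (offset, length, successor name)
        self.cache = OrderedDict()   # LRU cache of read or prefetched files
        self.cache_size = cache_size
        self.blob_reads = 0          # number of reads that touched the blob

    def merge(self, files):
        """Append files in order; each index entry records the next file."""
        names = [name for name, _ in files]
        for i, (name, data) in enumerate(files):
            successor = names[i + 1] if i + 1 < len(names) else None
            self.index[name] = (len(self.blob), len(data), successor)
            self.blob.extend(data)

    def _load(self, name):
        """Read one small file out of the merged blob via the index."""
        offset, length, _ = self.index[name]
        self.blob_reads += 1
        return bytes(self.blob[offset:offset + length])

    def _put(self, name, data):
        """Insert into the cache and evict least recently used entries."""
        self.cache[name] = data
        self.cache.move_to_end(name)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)

    def read(self, name):
        """Return a file's bytes, then prefetch its successor if uncached."""
        if name in self.cache:
            self.cache.move_to_end(name)
            data = self.cache[name]
        else:
            data = self._load(name)
            self._put(name, data)
        successor = self.index[name][2]
        if successor is not None and successor not in self.cache:
            self._put(successor, self._load(successor))
        return data


store = MergedStore()
store.merge([("soil.csv", b"ph=6.5"), ("crop.csv", b"wheat"), ("yield.csv", b"4.2t")])
first = store.read("soil.csv")   # loads soil.csv and prefetches crop.csv
second = store.read("crop.csv")  # served from the cache; prefetches yield.csv
```

In this sketch the name node analogue holds one index instead of one metadata entry per small file, and a sequential reader finds each next file already cached, which is the kind of effect the abstract attributes to merging and prefetching.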