Block Storage Optimization and Parallel Data Processing and Analysis of Product Big Data Based on the Hadoop Platform

Yajun Wang,Shengming Cheng,Xinchen Zhang,Junyu Leng,Jun Liu
DOI: https://doi.org/10.1155/2021/3839800
IF: 1.43
2021-12-31
Mathematical Problems in Engineering
Abstract:The traditional distributed database storage architecture has the problems of low efficiency and storage capacity in managing data resources of seafood products. We reviewed various storage and retrieval technologies for the big data resources. A block storage layout optimization method based on the Hadoop platform and a parallel data processing and analysis method based on the MapReduce model are proposed. A multireplica consistent hashing algorithm based on data correlation and spatial and temporal properties is used in the parallel data processing and analysis method. The data distribution strategy and block size adjustment are studied based on the Hadoop platform. A multidata source parallel join query algorithm and a multi-channel data fusion feature extraction algorithm based on data-optimized storage are designed for the big data resources of seafood products according to the MapReduce parallel frame work. Practical verification shows that the storage optimization and data-retrieval methods provide supports for constructing a big data resource-management platform for seafood products and realize efficient organization and management of the big data resources of seafood products. The execution time of multidata source parallel retrieval is only 32% of the time of the standard Hadoop scheme, and the execution time of the multichannel data fusion feature extraction algorithm is only 35% of the time of the standard Hadoop scheme.
engineering, multidisciplinary,mathematics, interdisciplinary applications
What problem does this paper attempt to address?