Efficient Data Aggregation and Duplicate Removal Using Grid-Based Hashing in Cloud-Assisted Industrial IoT

Saleh M. Altowaijri
DOI: https://doi.org/10.1109/access.2024.3471952
IF: 3.9
2024-10-12
IEEE Access
Abstract: The Industrial Internet of Things (IIoT) integrates sensors, devices, and equipment with internet connectivity and data processing capabilities. This connectivity allows sensors to collect and exchange data and to communicate with each other. The proliferation of sensors and data-producing devices in the industrial sector has led to high volumes of data, causing data duplication and related problems in storage, processing, and resource consumption. Addressing data duplication is therefore vital to ensure effective operation and resource optimization. Existing data deduplication techniques in IIoT environments often struggle with efficiency and scalability. Current approaches may generate hashes frequently and unnecessarily for minor variations in sensor readings, resulting in excessive computational overhead, higher energy consumption, and reduced network lifetime. To address these limitations, this paper proposes the Grid Hashing-based Efficient Data Aggregation (GH-EDA) scheme, a comprehensive solution that combines data aggregation, preprocessing, and region splitting, and employs an Extended Merkle Grid for efficient deduplication. The scheme begins with the aggregation of sensor data, followed by preprocessing steps that filter out irrelevant or noisy data. Subsequently, the data is partitioned into regions and refined to improve resource utilization, enabling fast duplicate detection while minimizing the number of comparisons. A key feature of the proposed scheme is a threshold-based approach to hash generation, which ensures that only substantial changes produce new hash values. Extensive simulations are conducted using Network Simulator-3. The performance of the proposed scheme is evaluated using metrics such as space reduction, search time, network lifetime, computation time, average latency, and energy utilization. Comparisons with existing techniques demonstrate the superior performance of GH-EDA.
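The threshold-based hash generation described in the abstract can be illustrated with a minimal sketch. This is not the GH-EDA implementation: the class name ThresholdHasher, the use of SHA-256, the fixed absolute threshold, and the choice to compare each reading against the last hashed value are all assumptions made for illustration, since the abstract does not specify them.

```python
import hashlib
from typing import Dict, Tuple


class ThresholdHasher:
    """Hypothetical helper: re-hash a reading only when it changes substantially."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold                # assumed absolute change threshold
        self._last_value: Dict[str, float] = {}   # sensor id -> value behind the last hash
        self._last_hash: Dict[str, str] = {}      # sensor id -> last hex digest
        self.hash_count = 0                       # number of hashes actually computed

    def digest(self, sensor_id: str, value: float) -> Tuple[str, bool]:
        """Return (hash, is_new). Minor variations reuse the previous hash."""
        prev = self._last_value.get(sensor_id)
        if prev is not None and abs(value - prev) < self.threshold:
            # Change is below the threshold: treat as a duplicate, skip hashing.
            return self._last_hash[sensor_id], False
        # Substantial change (or first reading): compute and store a new hash.
        payload = f"{sensor_id}:{value:.3f}".encode()
        new_hash = hashlib.sha256(payload).hexdigest()
        self._last_value[sensor_id] = value
        self._last_hash[sensor_id] = new_hash
        self.hash_count += 1
        return new_hash, True


if __name__ == "__main__":
    hasher = ThresholdHasher(threshold=0.5)
    for reading in [20.0, 20.1, 20.2, 21.5, 21.6]:
        digest, is_new = hasher.digest("temp-01", reading)
        print(reading, is_new, digest[:8])
    print("hashes computed:", hasher.hash_count)
```

In this sketch, five readings from one sensor trigger only two hash computations, because three of them differ from the last hashed value by less than the threshold. Comparing against the last hashed value rather than the immediately preceding reading is a design choice that prevents slow drift from going undetected.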
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic