Abstract: Addressing the critical challenge of ensuring data integrity in decentralized systems, this paper delves into the underexplored area of data falsification probabilities within Merkle Trees, which are pivotal in blockchain and Internet of Things (IoT) technologies. Despite their widespread use, a comprehensive understanding of the probabilistic aspects of data security in these structures remains a gap in current research. Our study aims to bridge this gap by developing a theoretical framework to calculate the probability of data falsification, taking into account various scenarios based on the length of the Merkle path and hash length. The research progresses from the derivation of an exact formula for falsification probability to an approximation suitable for cases with significantly large hash lengths. Empirical experiments validate the theoretical models, exploring simulations with diverse hash lengths and Merkle path lengths. The findings reveal a decrease in falsification probability with increasing hash length and an inverse relationship with longer Merkle paths. A numerical analysis quantifies the discrepancy between exact and approximate probabilities, underscoring the conditions for the effective application of the approximation. This work offers crucial insights into optimizing Merkle Tree structures for bolstering security in blockchain and IoT systems, achieving a balance between computational efficiency and data integrity.

What problem does this paper attempt to address?

The main problem this paper addresses is evaluating the probability of data forgery in Merkle trees used in the Internet of Things (IoT), in order to ensure data integrity and authenticity. Specifically, the research develops a theoretical framework for calculating the probability of data forgery and examines how the Merkle path length and the hash length affect this probability in different scenarios.

### Research Background

With the spread of IoT devices, the security and integrity of data have become crucial. Merkle trees, an efficient method for verifying the integrity of large data structures, are widely used in blockchain and IoT technologies. However, few studies examine the probability of data forgery in Merkle trees in these application scenarios. This research addresses that gap and provides a theoretical basis for optimizing the Merkle tree structure, thereby enhancing the security of blockchain and IoT systems.

### Main Problem Description

The core problem of the paper can be formalized as follows: given a Merkle tree in which a data block \(D_i\) is replaced with \(D'_i\) while all other data blocks remain unchanged, find the probability \(P(R = R')\) that the root node \(R\) remains unchanged.
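To make the setup concrete, the following minimal Python sketch recomputes a root along one Merkle path before and after a block is replaced. It is an illustration under stated assumptions, not the paper's program: `h`, `node_bytes`, and `root_from_path` are hypothetical names, SHA-256 truncated to `b` bits stands in for the hash function, and the path node is always placed on the left of its sibling for simplicity.

```python
import hashlib
from typing import Sequence


def h(data: bytes, b: int) -> int:
    """SHA-256 truncated to its top b bits (assumed stand-in for H, 1 <= b <= 256)."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - b)


def node_bytes(value: int, b: int) -> bytes:
    """Serialize a b-bit node value so it can be fed back into the hash."""
    return value.to_bytes((b + 7) // 8, "big")


def root_from_path(block: bytes, siblings: Sequence[int], b: int) -> int:
    """Hash the block, then combine with each sibling up to the root.

    Simplification: the path node is always the left input of the pair.
    """
    node = h(block, b)
    for sib in siblings:
        node = h(node_bytes(node, b) + node_bytes(sib, b), b)
    return node


# Siblings stay fixed because only one data block is replaced.
b = 16
siblings = [h(f"sibling-{j}".encode(), b) for j in range(3)]
root = root_from_path(b"D_i", siblings, b)
forged_root = root_from_path(b"D'_i", siblings, b)
print("falsification succeeded:", root == forged_root)
```

A forgery goes undetected exactly when the two roots are equal, which is the event \(R = R'\) whose probability the paper studies.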
The specific formula is as follows:

\[P_{\text{falsification}}=P(R = R')\]

where each node on the Merkle path is computed from the node below it on the path and that node's sibling, for both the original and the modified tree:

\[N^{(m)}_{\text{par}} = H(N^{(m - 1)}_{\text{par}}, N^{(m - 1)}_{\text{sib}})\]

\[N'^{(m)}_{\text{par}} = H(N'^{(m - 1)}_{\text{par}}, N'^{(m - 1)}_{\text{sib}})\]

Assuming that the hash function \(H\) behaves like a random oracle with a \(b\)-bit output, the probability that two different inputs produce the same output is:

\[P(H(D_i)=H(D'_i))=\frac{1}{2^b}\]

### Research Method

To calculate the probability of data forgery, the paper derives a formula that holds for any positive integer \(m\), where \(m\) is the Merkle path length. Written as a sum over the level \(k\) at which the first hash collision occurs (the leaf hash plus the \(m\) levels of the path), the probability is:

\[P_{\text{falsification}}=\sum_{k = 0}^{m}\frac{1}{2^b}\left(1-\frac{1}{2^b}\right)^{k}\]

Evaluating this geometric sum gives:

\[P_{\text{falsification}}=1-\left(1-\frac{1}{2^b}\right)^{m + 1}\]

For a relatively large hash length \(b\), an approximate formula can be used:

\[P_{\text{falsification}}\approx1 - e^{-\frac{m + 1}{2^b}}\]

### Experimental Verification

Experimental verification was carried out with a Python program. The results show that:

1. As the hash length \(b\) increases, the probability of data forgery decreases significantly.
2. As the Merkle path length \(m\) increases, the probability of data forgery increases somewhat.

These findings help in optimizing the design of Merkle trees in practical applications and in balancing security against computational efficiency.

### Conclusion

The research provides a theoretical model and supports it with experiments. This matters for applications that must ensure data integrity and authenticity in blockchain and IoT systems.
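The exact and approximate formulas can be compared numerically with a short sketch. This is not the paper's experimental program. The function names are hypothetical, and the simulation assumes the random-oracle model directly by drawing independent random \(b\)-bit values at each of the \(m + 1\) hash levels rather than hashing real data.

```python
import math
import random


def p_exact(m: int, b: int) -> float:
    """Exact formula 1 - (1 - 2^-b)^(m+1), evaluated stably for large b (b >= 1)."""
    return -math.expm1((m + 1) * math.log1p(-(2.0 ** -b)))


def p_approx(m: int, b: int) -> float:
    """Approximation 1 - exp(-(m+1) / 2^b)."""
    return -math.expm1(-(m + 1) * 2.0 ** -b)


def p_simulated(m: int, b: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate under the random-oracle assumption."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # m + 1 hash levels: the leaf hash plus m levels of the path.
        for _level in range(m + 1):
            # Original and forged node values collide with probability 2^-b.
            if rng.getrandbits(b) == rng.getrandbits(b):
                hits += 1  # a collision at any level leaves the root unchanged
                break
    return hits / trials


for b in (4, 8, 16):
    for m in (1, 5, 10):
        exact, approx = p_exact(m, b), p_approx(m, b)
        print(f"b={b:2d} m={m:2d} exact={exact:.6e} "
              f"approx={approx:.6e} diff={abs(exact - approx):.2e}")
```

Using `math.log1p` and `math.expm1` avoids the loss of precision that a direct `1 - (1 - 2**-b)**(m + 1)` would suffer for realistic hash lengths such as \(b = 256\). Small values of \(b\) are used in the simulation only so that collisions occur often enough to be observed.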

Evaluating the Security of Merkle Trees in the Internet of Things: An Analysis of Data Falsification Probabilities

Enhanced Security and Efficiency in Blockchain With Aggregated Zero-Knowledge Proof Mechanisms

An optimized transaction verification method for trustworthy blockchain-enabled IIoT

Adaptive Restructuring of Merkle and Verkle Trees for Enhanced Blockchain Scalability

Blockchain-based verification framework for data integrity in edge-cloud storage

Blockchain-based Dynamic Cloud Data Integrity Auditing Via Non-leaf Node Sampling of Rank-based Merkle Hash Tree

Data Security Storage Mechanism Based on Blockchain Network

Multimedia Fusion Privacy Protection Algorithm Based on IoT Data Security under Network Regulations

MtMR: Ensuring MapReduce Computation Integrity with Merkle Tree-Based Verifications

Secure Distributed Estimation Against Data Integrity Attacks in Internet-of-Things Systems

Merkle tree-blockchain-assisted privacy preservation of electronic medical records on offering medical data protection through hybrid heuristic algorithm

Merkle Tree and Blockchain-Based Cloud Data Auditing

Authenticating Spatial Queries on Blockchain Systems

Cloud Storage Integrity at Scale: A Case for Dynamic Hash Trees

MTTBA- A Key Contributor for Sustainable Energy Consumption Time and Space Utility for Highly Secured Crypto Transactions in Blockchain Technology

A trusted mechanism against device reputation attacks in Web‐of‐Things applications

Data security enhancement in internet of things using optimised hashing algorithm

Multiple Layer Public Blockchain Approach for Internet of Things (IoT) Systems

Approaches to Secure Inference in the Internet of Things: Performance Bounds, Algorithms, and Effective Attacks on IoT Sensor Networks

Detecting Neural Trojans Through Merkle Trees