Efficient State Sharding in Blockchain via Density-based Graph Partitioning

Bo Yin,Peng Zhang,Tingxuan Chen
DOI: https://doi.org/10.1145/3697840
IF: 3.35
2024-09-28
ACM Transactions on the Web
Abstract:Sharding is a promising technique for increasing a blockchain system's throughput by enabling parallel transaction processing. The main challenge of state sharding lies in ensuring the atomicity verification of cross-sharding transactions, which results in double communication overhead and increases the transaction's confirmation time. Previous research has primarily focused on developing cross-shard protocols for the fast and reliable validation of transactions involving multiple shards. These studies typically generate a large number of cross-shard transactions because they primarily use simple address mapping for state sharding, that is, the prefix/suffix of the account address. In this paper, we propose a state sharding scheme via density-based partitioning of the account-transaction graph. In order to reduce cross-shard transactions, the scheme groups correlated accounts into the same shard by generating the densest subgraphs, as the graph density describes the correlation among accounts, i.e., how often transactions have occurred among accounts. We formulate the graph density-based state sharding problem, with the goal of maximizing the average density across all shards under the workload constraint. We prove the NP-completeness of the problem. To reduce the complexity of finding the densest subgraph, we propose the pruning-based algorithm that reduces the search space by pre-pruning some invalid edges based on the concept of core number. We also extend the linear deterministic greedy algorithm and PageRank algorithm to handle new transactions in the dynamic scenario. We conduct extensive experiments using real transaction data from Ethereum. The experimental results demonstrate a strong correlation between the shard density and the number of cross-shard transactions, and the pruning-based algorithm can reduce the running time by an order of magnitude.
computer science, information systems, software engineering
What problem does this paper attempt to address?