Assessing the Similarity of Smart Contracts by Clustering their Interfaces

G. Salzer,Monika Di Angelo
DOI: https://doi.org/10.1109/TrustCom50675.2020.00261
2020-12-01
Abstract:Like most programs, smart contracts offer their functionality via entry points that constitute the interface. Interface standards, e.g. for tokens contracts, foster interoperability. Ethereum is the most prominent platform for smart contracts. The number of contract deployments approaches 30 million, corresponding to roughly 300 000 distinct contract codes. In view of these numbers, it is necessary to develop automated methods for classifying contracts regarding their purpose, if one aims at a qualitative and quantitative understanding of what blockchain applications are used for at large. We approach the task by considering contracts as similar if their interfaces are. We encode interfaces and their interrelationships as graphs and explore several algorithms regarding their ability to find clusters of functionally similar contracts. Our evaluation of the quality of clustering relies on a ground truth of token and wallet contracts identified in earlier work. Our analysis is based on the bytecodes deployed on the main chain of Ethereum up to block 10.5 million, mined on July 21, 2020.
Computer Science
What problem does this paper attempt to address?