CCGIR: Information retrieval-based code comment generation method for smart contracts

Guang Yang,Ke Liu,Xiang Chen,Yanlin Zhou,Chi Yu,Hao Lin
DOI: https://doi.org/10.1016/j.knosys.2021.107858
2022-02-01
Abstract:A smart contract is a computer program, which is intended to automatically execute, control or document legally relevant events and actions according to the terms of a contract. About 10% of the security vulnerabilities in smart contracts are caused by misuse of codes without comments. Therefore, there is a need to design effective automatic code comment generation methods for smart contracts. In this study, we propose an information retrieval-based code comment generation method CCGIR for smart contracts. Since code clones are common in smart contract development, CCGIR finds the most similar code in the code repository and reuses its comment through an information retrieval approach from three aspects: semantic similarity, lexical similarity, and syntactic similarity of smart contract codes. We select a corpus, which contains 57,676 unique pairs of <method, comment> from 40,932 real-world smart contracts, as our experimental subject. Then we conduct empirical studies to evaluate the effectiveness of our proposed method. Experimental results show that CCGIR can outperform nine state-of-the-art baselines in terms of three performance measures. Moreover, we perform a human study to further verify that CCGIR can generate higher quality comments. Finally, we find CCGIR can achieve promising performance on the other two code comment generation tasks (i.e., code comment generation for Java and code comment generation for Python). Due to the simplicity and effectiveness of our proposed method, we recommend researchers can use our proposed method as the baseline when evaluating their proposed novel code comment generation methods.
computer science, artificial intelligence
What problem does this paper attempt to address?