From Local to Global Semantic Clone Detection

Yuan, Weiqiang Kong, Gang Hou, Yan Hu, Masahiko Watanabe, Akira Fukuda
DOI: https://doi.org/10.1109/dsa.2019.00012
2020-01-01
Abstract: Clone detection finds similar code fragments (referred to as clone code) in software products and can help with software optimization and maintenance. Code clone detection can be divided into textual, lexical, syntactic, and semantic levels. Existing techniques have achieved good results at the first three levels, but no significant results have been obtained in semantic clone detection. In this paper, we propose a novel semantic-level clone detection approach. We use the control flow graph (CFG) as an intermediate representation of a program method and combine the classical dynamic time warping (DTW) algorithm from the field of speech recognition with two deep neural network models, a bidirectional RNN autoencoder and a graph convolutional network (GCN), to detect semantic-level clones from local to global. We experimented with a dataset consisting of five large-scale real-world systems and a code corpus containing a large number of programming problems. The experimental results show that our approach achieves good results in detecting both local and global semantic clones.
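The abstract names classical dynamic time warping (DTW) as one building block but does not say how it is applied to CFG-derived data. As a rough illustration of the textbook algorithm only, the sketch below computes the DTW distance between two numeric sequences. The function name `dtw_distance`, the use of plain floats as sequence elements, and the absolute-difference local cost are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Textbook DTW distance between two numeric sequences.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: absolute difference (an assumption for this sketch).
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

Because DTW allows one element to align with several elements of the other sequence, two sequences that differ only by a repeated element, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, have distance zero under this local cost.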