Low-Complexity Code Clone Detection Using Graph-based Neural Networks

Hu Liu, Hui Zhao, Changhao Han, Lu Hou
DOI: https://doi.org/10.1109/msn57253.2022.00129
2022-01-01
Abstract: Code clone detection is of great significance for intellectual property protection and software maintenance. Deep learning has been applied in prior research and has achieved better performance than traditional methods. To adapt to more application scenarios and improve detection efficiency, this paper proposes a low-complexity code clone detection method using a graph-based neural network. As the input to the neural network, code features are represented by abstract syntax trees (ASTs) from which redundant edges are removed. This pruning avoids interference in the network's message passing and reduces the size of the graph. The graph pairs for code clone detection are then fed into a message passing neural network (MPNN). In addition, a gated recurrent unit (GRU) is used to learn the information between graph pairs, avoiding the graph mapping operation. After multiple iterations, an attention mechanism reads out the graph vector, and the cosine similarity between the graph vectors is computed to obtain the code similarity. Experiments on two datasets show that the proposed clone detection scheme removes about 20% of the redundant edges and reduces model weights by 25% and multiply-accumulate operations (MACs) by 16%. As a result, the proposed method effectively reduces the training time of the graph neural network while achieving performance similar to that of the baseline network.
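The final two steps of the pipeline described in the abstract, attention-based readout followed by cosine similarity, can be illustrated with a minimal sketch. The abstract does not specify the exact form of the attention mechanism, so the version below is an assumption: it scores each node embedding by its dot product with a hypothetical `context` vector, normalizes the scores with a softmax, and takes the weighted sum. The function names, the `context` vector, and the toy embeddings are illustrative only and are not taken from the paper.

```python
import math


def attention_readout(node_vectors, context):
    """Collapse per-node vectors into one graph vector.

    Assumed form (not from the paper): softmax over the dot products
    between each node vector and a context vector, then a weighted sum.
    """
    scores = [sum(h * c for h, c in zip(node, context)) for node in node_vectors]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(node_vectors[0])
    return [
        sum(w * node[i] for w, node in zip(weights, node_vectors))
        for i in range(dim)
    ]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy node embeddings standing in for the MPNN output of two code graphs.
context = [1.0, 0.0]
graph_a = [[1.0, 0.0], [0.5, 0.5]]
graph_b = [[2.0, 0.0], [1.0, 1.0]]

similarity = cosine_similarity(
    attention_readout(graph_a, context),
    attention_readout(graph_b, context),
)
```

In a trained model the node embeddings would come from the message passing iterations and the attention parameters would be learned; here they are fixed by hand so that the readout and similarity steps can be followed in isolation.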