File Similarity Determination Based on Function Call Graph

Jin Zhang,Dahai Jin,Yunzhan Gong
DOI: https://doi.org/10.1109/ICECOME.2018.8644900
2018-01-01
Abstract:The similarity detection of the program has important significance in code reuse, plagiarism detection, intellectual property protection and information retrieval methods. Attribute counting methods cannot take into account program semantics. The method based on syntax tree or graph structure has a very high construction cost and low space efficiency. So it is difficult to solve problems in large-scale software systems. This paper uses different decision strategies for different levels, then puts forward a similarity detection method at the file level. This method can make full use of the features of the program and take into account the space-time efficiency. By using static analysis methods, we get function features and control flow features of files. And based on this, we establish the function call graph. The similar degree between two files can be measured with the two graphs. Experimental results show the method can effectively detect similar files. Finally, this paper discusses the direction of development of this method.
What problem does this paper attempt to address?