CCSharp: an Efficient Three-Phase Code Clone Detector Using Modified PDGs

Min Wang, Pengcheng Wang, Yun Xu
DOI: https://doi.org/10.1109/apsec.2017.16
2017-01-01
Abstract: Detecting code clones in software systems is becoming increasingly important as open source projects flourish. Despite a large body of active research, tools that detect clones, especially high-level clones, both efficiently and accurately are still lacking. In this paper, we present CCSharp, a three-phase PDG-based clone detector that detects high-level clones as well as many other clones in software systems. To address the high time cost of PDG-based tools, we adopt two strategies that reduce the overall amount of computation: modification of the PDG structure and characteristic vector filtering. For the structure modification, we propose a novel technique that merges procedure invocation nodes, which makes clone detection insensitive to procedure parameters and disguise and also reduces the size of the PDG. We run clone detection on both real-world and artificial codebases with CCSharp and three other state-of-the-art tools. Experimental results show that CCSharp achieves both high recall and high precision, and detects many more unique clones than the other three tools.
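As a rough illustration of characteristic vector filtering, the sketch below summarizes each PDG by a vector of node-kind counts and keeps only pairs whose vectors are close, so that the expensive graph comparison runs on fewer candidates. The abstract does not specify the vector definition, the distance measure, or the threshold used by CCSharp; the node kinds, the L1 distance, `max_distance`, and all function names here are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical node kinds; the abstract does not list the ones CCSharp uses.
NODE_KINDS = ("assign", "call", "branch", "loop", "return")


def characteristic_vector(node_kinds):
    """Count how often each node kind occurs in one PDG."""
    counts = Counter(node_kinds)
    return tuple(counts[kind] for kind in NODE_KINDS)


def l1_distance(u, v):
    """Sum of absolute component differences (an assumed distance measure)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def candidate_pairs(pdgs, max_distance=1):
    """Return name pairs whose vectors differ by at most max_distance.

    Only these pairs would be passed on to a full PDG comparison.
    """
    vectors = {name: characteristic_vector(kinds) for name, kinds in pdgs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(vectors), 2)
        if l1_distance(vectors[a], vectors[b]) <= max_distance
    ]


# Toy PDGs given as flat lists of node kinds (edges omitted for brevity).
pdgs = {
    "f": ["assign", "call", "branch", "return"],
    "g": ["assign", "call", "branch", "assign", "return"],
    "h": ["loop", "loop", "call", "call", "call", "return"],
}
print(candidate_pairs(pdgs))  # [('f', 'g')]
```

Here `f` and `g` differ by a single `assign` node and survive the filter, while `h` is far from both and is discarded before any graph matching would take place.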