WuKong: a Scalable and Accurate Two-Phase Approach to Android App Clone Detection.

Haoyu Wang,Yao Guo,Ziang Ma,Xiangqun Chen
DOI: https://doi.org/10.1145/2771783.2771795
2015-01-01
Abstract:Repackaged Android applications (app clones) have been found in many third-party markets, which not only compromise the copyright of original authors, but also pose threats to security and privacy of mobile users. Both fine-grained and coarse-grained approaches have been proposed to detect app clones. However, fine-grained techniques employing complicated clone detection algorithms are difficult to scale to hundreds of thousands of apps, while coarse-grained techniques based on simple features are scalable but less accurate. This paper proposes WuKong, a two-phase detection approach that includes a coarse-grained detection phase to identify suspicious apps by comparing light-weight static semantic features, and a fine-grained phase to compare more detailed features for only those apps found in the first phase. To further improve the detection speed and accuracy, we also introduce an automated clustering-based preprocessing step to filter third-party libraries before conducting app clone detection. Experiments on more than 100,000 Android apps collected from five Android markets demonstrate the effectiveness and scalability of our approach.
What problem does this paper attempt to address?