Fast, Scalable Detection of "piggybacked" Mobile Applications

Wu Zhou, Yajin Zhou, Michael C. Grace, Xuxian Jiang, Shihong Zou
DOI: https://doi.org/10.1145/2435349.2435377
2013-01-01
Abstract: Mobile applications (or apps) are rapidly growing in number and variety. These apps provide useful features, but also bring certain privacy and security risks. For example, malicious authors may attach destructive payloads to legitimate apps to create so-called "piggybacked" apps and advertise them in various app markets to infect unsuspecting users. To detect them, existing approaches typically employ pair-wise comparison, which unfortunately has limited scalability. In this paper, we present a fast and scalable approach to detect these apps in existing Android markets. Based on the fact that the attached payload is not an integral part of a given app's primary functionality, we propose a module decoupling technique to partition an app's code into primary and non-primary modules. Also, noticing that piggybacked apps share the same primary modules as the original apps, we develop a feature fingerprint technique to extract various semantic features (from primary modules) and convert them into feature vectors. We then construct a metric space and propose a linearithmic search algorithm (with O(n log n) time complexity) to efficiently and scalably detect piggybacked apps. We have implemented a prototype and used it to study 84,767 apps collected from various Android markets in 2011. Our results show that the processing of these apps takes less than nine hours on a single machine. In addition, among these markets, piggybacked apps range from 0.97% to 2.7% (the official Android Market has 1%). Further investigation shows that they are mainly used to steal ad revenue from the original developers and implant malicious payloads (e.g., for remote bot control). These results demonstrate the effectiveness and scalability of our approach.
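The search step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the distance function or the index structure, so everything below is an assumption made for illustration: fingerprints are modelled as sets of feature strings, the metric is Jaccard distance, the index is a vantage-point tree, and the feature names (`perm:INTERNET` and so on) are hypothetical. The point is only that, once fingerprints live in a metric space, the triangle inequality lets a nearest-neighbor query skip most candidates instead of comparing every pair of apps.

```python
import random
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

Fingerprint = FrozenSet[str]
App = Tuple[str, Fingerprint]  # (app name, features of its primary module)


def jaccard_distance(a: Fingerprint, b: Fingerprint) -> float:
    """Jaccard distance on sets; it satisfies the triangle inequality."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


@dataclass
class Node:
    app: App
    radius: float              # median distance from this vantage point
    inside: Optional["Node"]   # apps with distance < radius
    outside: Optional["Node"]  # apps with distance >= radius


def build(apps: List[App], rng: random.Random) -> Optional[Node]:
    """Build a vantage-point tree by splitting at the median distance."""
    if not apps:
        return None
    apps = list(apps)
    vantage = apps.pop(rng.randrange(len(apps)))
    if not apps:
        return Node(vantage, 0.0, None, None)
    dists = sorted(
        ((jaccard_distance(vantage[1], a[1]), a) for a in apps),
        key=lambda pair: pair[0],
    )
    radius = dists[len(dists) // 2][0]
    inside = [a for d, a in dists if d < radius]
    outside = [a for d, a in dists if d >= radius]
    return Node(vantage, radius, build(inside, rng), build(outside, rng))


def nearest(
    node: Optional[Node],
    query: Fingerprint,
    best: Tuple[float, Optional[str]] = (float("inf"), None),
) -> Tuple[float, Optional[str]]:
    """Return (distance, name) of the indexed app closest to the query."""
    if node is None:
        return best
    d = jaccard_distance(query, node.app[1])
    if d < best[0]:
        best = (d, node.app[0])
    if d < node.radius:
        best = nearest(node.inside, query, best)
        # The outside half can only help if the search ball crosses the radius.
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, best)
    else:
        best = nearest(node.outside, query, best)
        if d - best[0] <= node.radius:
            best = nearest(node.inside, query, best)
    return best


# Hypothetical fingerprints of primary modules already seen in a market.
known_apps: List[App] = [
    ("orig", frozenset({"perm:INTERNET", "api:getDeviceId",
                        "intent:MAIN", "api:openConnection"})),
    ("game", frozenset({"perm:VIBRATE", "api:playSound", "intent:MAIN"})),
    ("notes", frozenset({"perm:WRITE_EXTERNAL_STORAGE", "api:openFileOutput"})),
]
tree = build(known_apps, random.Random(0))

# Primary module of a suspect app, after the attached payload is set aside.
suspect = frozenset({"perm:INTERNET", "api:getDeviceId", "intent:MAIN"})
distance, name = nearest(tree, suspect)
print(name, distance)
```

In this sketch a small distance to an indexed app would flag the suspect as a candidate for closer inspection. Any cutoff on that distance is a tuning choice that the abstract does not specify.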