AppDNA: Profiling App Behavior via Deep-Learning Function Call Graphs

Anran Li, Shuangshuang Xue, Xiang-Yang Li, Lan Zhang, Jianwei Qian
DOI: https://doi.org/10.1109/tetc.2020.3026335
2022-01-01
IEEE Transactions on Emerging Topics in Computing
Abstract: The growing number and diversity of applications make malware detection and app recommendation more challenging. In this work, we design a framework, AppDNA, that automatically generates a compact representation of each app to comprehensively profile its behaviors. The representation is generated once per app and can then be used for a wide variety of objectives, including malware detection, app categorization, and app version detection. We propose a function-call-graph-based app profiling scheme built on a comprehensive and deep understanding of an app's behaviors. We design a graph-encoding method that converts a large function call graph into a 64-dimensional fixed-length vector to achieve robust app profiling. Our extensive evaluations on 86,332 apps demonstrate that our approach performs app profiling with high accuracy and low computation cost: it takes about 46.5 seconds to extract one app's function call graph and 0.68 seconds to encode a function call graph; it classifies all 4,024 apps (benign/malware) in around 5.06 seconds with an accuracy of about 93.07 percent; it classifies the families of all 570 malicious apps (21 families in total) in around 0.83 seconds with an accuracy of 82.3 percent; and it classifies the functionality of 9,730 apps into 2 categories with an accuracy of 88.1 percent, or into 7 categories with an accuracy of 33.3 percent.
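The timing figures in the abstract can be put on a common per-app scale with a little arithmetic. The sketch below is illustrative only: it assumes the reported classification totals cover just the classification step and that per-app cost is a simple average. The function and variable names are hypothetical and do not come from the paper.

```python
# Figures copied from the abstract; all names here are illustrative.
EXTRACT_S = 46.5   # seconds to extract one app's function call graph
ENCODE_S = 0.68    # seconds to encode one function call graph


def per_app_ms(total_seconds: float, n_apps: int) -> float:
    """Average time per app in milliseconds (assumes a uniform average)."""
    return total_seconds / n_apps * 1000.0


# Benign/malware classification: 4,024 apps in about 5.06 seconds.
malware_ms = per_app_ms(5.06, 4024)

# Malware family classification: 570 apps in about 0.83 seconds.
family_ms = per_app_ms(0.83, 570)

# One-time profiling cost per app: graph extraction plus encoding.
profile_s = EXTRACT_S + ENCODE_S

# Share of the one-time profiling cost spent on graph extraction.
extract_share = EXTRACT_S / profile_s

print(f"benign/malware: {malware_ms:.2f} ms per app")
print(f"malware family: {family_ms:.2f} ms per app")
print(f"profiling:      {profile_s:.2f} s per app "
      f"({extract_share:.1%} spent on extraction)")
```

Under these assumptions, classification averages a little over one millisecond per app, while the one-time profiling step takes roughly 47 seconds and is dominated by call graph extraction.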
computer science, information systems, telecommunications