Detecting Android Malware and Classifying Its Families in Large-scale Datasets

Bo Sun,Takeshi Takahashi,Tao Ban,Daisuke Inoue
DOI: https://doi.org/10.1145/3464323
2022-06-30
ACM Transactions on Management Information Systems
Abstract:To relieve the burden of security analysts, Android malware detection and its family classification need to be automated. There are many previous works focusing on using machine (or deep) learning technology to tackle these two important issues, but as the number of mobile applications has increased in recent years, developing a scalable and precise solution is a new challenge that needs to be addressed in the security field. Accordingly, in this article, we propose a novel approach that not only enhances the performance of both Android malware and its family classification, but also reduces the running time of the analysis process. Using large-scale datasets obtained from different sources, we demonstrate that our method is able to output a high F-measure of 99.71% with a low FPR of 0.37%. Meanwhile, the computation time for processing a 300K dataset is reduced to nearly 3.3 hours. In addition, in classification evaluation, we demonstrate that the F-measure, precision, and recall are 97.5%, 96.55%, 98.64%, respectively, when classifying 28 malware families. Finally, we compare our method with previous studies in both detection and classification evaluation. We observe that our method produces better performance in terms of its effectiveness and efficiency.
What problem does this paper attempt to address?