Impact of Code Deobfuscation and Feature Interaction in Android Malware Detection

Yun-Chung Chen, Hong-Yen Chen, Takeshi Takahashi, Bo Sun, Tsung-Nan Lin
DOI: https://doi.org/10.1109/access.2021.3110408
IF: 3.9
2021-01-01
IEEE Access
Abstract: With more than three million applications already in the Android marketplace, various malware detection systems based on machine learning have been proposed to prevent attacks from cybercriminals; most of these systems use static analysis to extract application features. However, many features generated by static analysis can be easily thwarted by obfuscation techniques. Several researchers have therefore addressed the obfuscation problem with obfuscation-invariant features, but to the best of our knowledge, no researcher has utilized deobfuscation techniques. To this end, we integrate a code deobfuscation technique into an Android malware detection system and investigate its effects. Experimental results indicate that code deobfuscation can successfully retrieve useful information concealed by obfuscation. Further, we propose interaction terms based on identified feature interactions. Because many feature values are correlated with the size of the application, the proposed interaction terms aim to eliminate the interference between application size and other features. In addition, the experimental results indicate that these interaction terms rank highly in terms of feature importance values. Our proposed Android malware detection model achieves 99.55% accuracy and a 94.61% F1-score on the well-known Drebin dataset, which is better than the performance of previous works.
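The abstract does not state the exact form of the proposed interaction terms. As a rough illustration of the general idea only, the sketch below assumes one plausible form: dividing each raw feature count by the application size, so that two applications with the same feature density receive the same value regardless of how large they are. The function name `add_size_interaction_terms`, the feature names `app_size` and `n_api_calls`, and the ratio form itself are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def add_size_interaction_terms(
    features: Dict[str, float], size_key: str = "app_size"
) -> Dict[str, float]:
    """Return a copy of `features` with size-normalized terms added.

    Assumption: a ratio (feature / size) is used as the interaction term.
    The paper's actual definition may differ.
    """
    size = features[size_key]
    if size <= 0:
        raise ValueError("application size must be positive")

    out = dict(features)
    for name, value in features.items():
        if name == size_key:
            continue
        # Hypothetical interaction term: feature value per unit of size.
        out[f"{name}_per_size"] = value / size
    return out


# Two hypothetical apps: the raw counts differ tenfold, the density does not.
small = {"app_size": 100.0, "n_api_calls": 20.0}
large = {"app_size": 1000.0, "n_api_calls": 200.0}

small_terms = add_size_interaction_terms(small)
large_terms = add_size_interaction_terms(large)
```

In this toy case the raw `n_api_calls` values would separate the two applications purely because of size, whereas the derived `n_api_calls_per_size` values coincide, which is the kind of size-driven interference the abstract describes.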
Computer Science, Information Systems; Telecommunications; Engineering, Electrical & Electronic