Discovering optimal features using static analysis and a genetic search based method for Android malware detection
Ahmad Firdaus,Nor Badrul Anuar,Ahmad Karim,Mohd Faizal Ab Razak
DOI: https://doi.org/10.1631/FITEE.1601491
2019-10-03
Abstract:Abstract Mobile device manufacturers are rapidly producing miscellaneous Android versions worldwide. Simultaneously, cyber criminals are executing malicious actions, such as tracking user activities, stealing personal data, and committing bank fraud. These criminals gain numerous benefits as too many people use Android for their daily routines, including important communi-cations. With this in mind, security practitioners have conducted static and dynamic analyses to identify malware. This study used static analysis because of its overall code coverage, low resource consumption, and rapid processing. However, static analysis requires a minimum number of features to efficiently classify malware. Therefore, we used genetic search (GS), which is a search based on a genetic algorithm (GA), to select the features among 106 strings. To evaluate the best features determined by GS, we used five machine learning classifiers, namely, Naïve Bayes (NB), functional trees (FT), J48, random forest (RF), and multilayer perceptron (MLP). Among these classifiers, FT gave the highest accuracy (95%) and true positive rate (TPR) (96.7%) with the use of only six features.
engineering, electrical & electronic,computer science, information systems, software engineering