Abstract: Many existing Machine Learning (ML) based Android malware detection approaches combine ML classifiers with features such as security-sensitive APIs, system calls, control-flow structures, and information flows to achieve accurate detection. Each of these feature sets provides a unique semantic perspective (or view) of apps’ behaviors, with inherent strengths and limitations. That is, some views are well suited to detecting certain attacks but poorly suited to characterizing others. Most existing malware detection approaches use only one (or a selected few) of these feature sets, which prevents them from detecting a vast majority of attacks. To address this limitation, we propose MKLDroid, a unified framework that systematically integrates multiple views of apps to perform comprehensive malware detection and malicious code localization. The rationale is that, while a malware app can disguise itself in some views, disguising itself in every view while preserving its malicious intent is much harder. MKLDroid uses a graph kernel to capture structural and contextual information from apps’ dependency graphs and to identify malicious code patterns in each view. It then employs Multiple Kernel Learning (MKL) to find the weighted combination of views that yields the best detection accuracy. Besides multi-view learning, MKLDroid’s salient trait is its ability to locate fine-grained malicious code portions (e.g., methods or classes) in dependency graphs. Malicious code localization serves several important applications, such as supporting human analysts who study malware behaviors, engineering malware signatures, and developing other countermeasures. Through large-scale experiments on several datasets (including apps from the wild), we demonstrate that MKLDroid consistently outperforms three state-of-the-art techniques in accuracy while maintaining comparable efficiency. In our malicious code localization experiments on a dataset of repackaged malware, MKLDroid identified the malicious classes with 94% average recall. Our work opens up two new avenues in malware research: (i) enabling the research community to examine Android malware behaviors from multiple perspectives simultaneously, and (ii) performing precise and scalable malicious code localization.
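
The idea of learning a weighted combination of per-view kernels can be illustrated with a small sketch. The abstract does not specify MKLDroid's graph kernel or MKL solver, so the code below substitutes a simple kernel-target alignment heuristic for weighting; the view names (`api_view`, `flow_view`), the toy kernel values, and the helper functions are all hypothetical.

```python
from math import sqrt


def frobenius(a, b):
    """Frobenius inner product of two equally sized square matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def alignment(kernel, labels):
    """Kernel-target alignment: cosine similarity between K and y y^T."""
    target = [[yi * yj for yj in labels] for yi in labels]
    denom = sqrt(frobenius(kernel, kernel) * frobenius(target, target))
    return frobenius(kernel, target) / denom if denom else 0.0


def combine_views(kernels, labels):
    """Weight each view by its (non-negative) alignment and sum the kernels.

    This is a stand-in heuristic, not the MKL optimization used by MKLDroid.
    """
    scores = [max(alignment(k, labels), 0.0) for k in kernels]
    total = sum(scores)
    n_views = len(kernels)
    # Fall back to uniform weights if no view aligns with the labels.
    weights = [s / total for s in scores] if total else [1.0 / n_views] * n_views
    n = len(labels)
    combined = [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]
    return weights, combined


# Toy data (assumed values): four apps, +1 = malware, -1 = benign.
labels = [1, 1, -1, -1]

# Hypothetical view whose kernel separates the two classes well.
api_view = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
]

# Hypothetical view in which all apps look equally similar (uninformative).
flow_view = [
    [1.0, 0.5, 0.5, 0.5],
    [0.5, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.5],
    [0.5, 0.5, 0.5, 1.0],
]

weights, combined = combine_views([api_view, flow_view], labels)
print("view weights:", weights)
```

In this toy setting the more discriminative view receives the larger weight. The combined matrix could then be passed to any kernel classifier that accepts a precomputed Gram matrix; an actual MKL solver would instead learn the weights jointly with the classifier.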
