Abstract: The continuous emergence of malware threatens the Android platform and user privacy. As the Android system and malware evolve, it is challenging to design a method that accurately identifies the categories of sophisticated malware, including known and unknown families and their obfuscated variants, because such samples may be newly emerging and lack available detection knowledge. Although some methods use anomaly detection and zero-shot techniques to identify unseen applications, they are limited to binary classification or lack robustness, stability, universality, and interpretability in multi-class identification. To this end, we first propose a generic meta-features mining algorithm that discovers the potential relationships between samples belonging to the same category. We then present metaNet, a novel method that leverages meta-features to identify sophisticated Android malware. Specifically, metaNet is powered by four main components: (i) mExtractor, a feature collector that obtains static and dynamic features; (ii) mProcessor, which derives the unique meta-features of each category from the extracted features; (iii) mLearner, a machine learning suite that leverages features and meta-features to design and train a classifier called HSU-Net; and (iv) mEnforcer, a flexible deployer that identifies the categories of malware families in the real world. We implement a prototype of metaNet in 15K lines of Python code and compare it with state-of-the-art (SOTA) methods. The results show that it achieves superior performance in binary classification for both known families (99.52% accuracy) and unknown families (99.31% accuracy when trained with 80% known families), and that it also performs well in multi-class identification, with 99.05% and 93.45% accuracy for known and unknown families, respectively. Furthermore, we deploy and evaluate metaNet in the real world. It identifies applications with acceptable time and memory overheads, i.e., an average of 11.8 s and 56 MB per sample for a sample size of 8 MB. In addition, the few-shot detection and feature perturbation experiments reflect the robustness and stability it gains from meta-features. Finally, we collect the traffic of 112 decentralized applications (DApps) belonging to 16 categories, such as finance and health, and evaluate metaNet on DApp identification. The results illustrate its applicability across various tasks: it accurately classifies 94.6% and 81.36% of DApp flows in the all-known and 80%-known DApp scenarios, respectively, outperforming the SOTA methods.
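The abstract does not specify how meta-features are mined or how HSU-Net uses them, so the following is only a toy illustration of the general idea of category-specific features, not the metaNet algorithm. It assumes, purely for illustration, that each sample is a set of string features, that a meta-feature is one present in most samples of a category and rare in all others, and that a sample matching no category's meta-features is reported as unknown. The function names, thresholds, family labels, and feature names are all hypothetical.

```python
from collections import Counter


def mine_meta_features(samples, min_support=0.8, max_outside=0.2):
    """Toy stand-in for meta-features mining (NOT the paper's algorithm).

    samples: list of (category, iterable_of_features).
    A feature is kept for a category if it occurs in at least
    `min_support` of that category's samples and in at most
    `max_outside` of the samples from every other category combined.
    """
    by_cat = {}
    for cat, feats in samples:
        by_cat.setdefault(cat, []).append(set(feats))

    meta = {}
    for cat, members in by_cat.items():
        others = [f for c, group in by_cat.items() if c != cat for f in group]
        inside = Counter(feat for feats in members for feat in feats)
        outside = Counter(feat for feats in others for feat in feats)
        meta[cat] = {
            feat
            for feat, count in inside.items()
            if count / len(members) >= min_support
            and (outside[feat] / len(others) if others else 0.0) <= max_outside
        }
    return meta


def identify(features, meta, min_overlap=0.5):
    """Return the category whose meta-features best match, else 'unknown'."""
    features = set(features)
    best_cat, best_score = "unknown", 0.0
    for cat, meta_feats in meta.items():
        if not meta_feats:
            continue
        score = len(meta_feats & features) / len(meta_feats)
        if score > best_score:
            best_cat, best_score = cat, score
    return best_cat if best_score >= min_overlap else "unknown"


# Hypothetical training samples: family label plus extracted features.
training = [
    ("sms_fraud", {"SEND_SMS", "READ_CONTACTS", "INTERNET"}),
    ("sms_fraud", {"SEND_SMS", "READ_CONTACTS", "INTERNET", "VIBRATE"}),
    ("ransomware", {"SYSTEM_ALERT_WINDOW", "DEVICE_ADMIN", "INTERNET"}),
    ("ransomware", {"SYSTEM_ALERT_WINDOW", "DEVICE_ADMIN", "INTERNET", "WAKE_LOCK"}),
]

meta = mine_meta_features(training)
print(meta)
print(identify({"SEND_SMS", "READ_CONTACTS", "CAMERA"}, meta))
print(identify({"INTERNET", "CAMERA"}, meta))
```

In this sketch the shared feature `INTERNET` is discarded because it does not distinguish the two families, which is the intuition behind keeping only features unique to a category. A real system would replace the overlap rule with a trained classifier.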

Metanet: Interpretable Unknown Mobile Malware Identification with a Novel Meta-Features Mining Algorithm
