Abstract: The continuous emergence of malware threatens the Android platform and user privacy. As the Android system and malware evolve, it is challenging to design a method that accurately identifies the categories of sophisticated malware, including known and unknown families as well as their obfuscated variants, because such samples may be newly emerging and lack available detection knowledge. Although some methods use anomaly detection and zero-shot techniques to identify unseen applications, they are limited to binary classification or lack robustness, stability, universality, and interpretability in multi-class identification. To this end, we first propose a generic meta-features mining algorithm that discovers the potential relationships between samples belonging to the same category. We then present metaNet, a novel method that leverages meta-features to identify sophisticated Android malware. Specifically, metaNet is mainly powered by four components: (i) mExtractor, a feature collector that obtains static and dynamic features; (ii) mProcessor, which derives the unique meta-features of each category from the extracted features; (iii) mLearner, a machine learning suite that leverages features and meta-features to design and train a classifier called HSU-Net; and (iv) mEnforcer, a flexible deployer that identifies the categories of malware families in the real world. We implement a prototype of metaNet in 15K lines of Python code and compare it with state-of-the-art (SOTA) methods. The results show that it not only achieves superior performance in binary classification for known families (99.52% accuracy) and unknown families (99.31% accuracy when trained with 80% known families), but also performs well in multi-class identification, with 99.05% and 93.45% accuracy for known and unknown families, respectively. Furthermore, we deploy and evaluate metaNet in the real world. It identifies applications with acceptable time and memory overhead, i.e., an average of 11.8 s and 56 MB per sample with a size of 8 MB. In addition, the few-shot detection and feature perturbation experiments reflect its robustness and stability, which benefit from meta-features. Finally, we collect the traffic of 112 decentralized applications (DApps) belonging to 16 categories, such as finance and health, and evaluate metaNet on DApp identification. The results illustrate its applicability across tasks: it accurately classifies 94.6% and 81.36% of DApp flows in the all-known and 80%-known DApp scenarios, respectively, outperforming the SOTA methods.
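The abstract does not specify how the meta-features mining algorithm works. As a rough illustration of the general idea only (features that relate samples of the same category to each other), the sketch below keeps features that are frequent inside a category and rare outside it. The function name `mine_meta_features`, the thresholds `min_support` and `max_outside`, and the toy permission-style features are all assumptions made for this example, not the authors' algorithm or data.

```python
from collections import defaultdict


def mine_meta_features(samples, min_support=0.8, max_outside=0.2):
    """Hypothetical sketch: pick per-category features that are common
    inside the category and uncommon in all other categories.

    samples: iterable of (category, iterable_of_feature_names).
    Returns a dict mapping category -> set of selected feature names.
    """
    by_category = defaultdict(list)
    for category, features in samples:
        by_category[category].append(set(features))

    def support(feature, groups):
        # Fraction of feature sets in `groups` that contain `feature`.
        if not groups:
            return 0.0
        return sum(feature in g for g in groups) / len(groups)

    meta = {}
    for category, inside in by_category.items():
        # Feature sets of every sample that belongs to another category.
        outside = [g for c, groups in by_category.items()
                   if c != category for g in groups]
        candidates = set().union(*inside)
        meta[category] = {
            f for f in candidates
            if support(f, inside) >= min_support
            and support(f, outside) <= max_outside
        }
    return meta


# Toy data (invented for illustration): two families, two samples each.
toy_samples = [
    ("sms_trojan", {"SEND_SMS", "INTERNET", "READ_CONTACTS"}),
    ("sms_trojan", {"SEND_SMS", "INTERNET"}),
    ("ransomware", {"INTERNET", "DEVICE_ADMIN", "WRITE_STORAGE"}),
    ("ransomware", {"INTERNET", "DEVICE_ADMIN"}),
]
toy_meta = mine_meta_features(toy_samples)
```

In this toy run, `INTERNET` is dropped because every family uses it, while a feature shared by all samples of one family and absent elsewhere is kept. A real system would mine such relationships from the static and dynamic features that mExtractor collects.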

Metanet: Interpretable Unknown Mobile Malware Identification with a Novel Meta-Features Mining Algorithm
