A novel deep framework for dynamic malware detection based on API sequence intrinsic features

Ce Li, Qiujian Lv, Ning Li, Yan Wang, Degang Sun, Yuanyuan Qiao
DOI: https://doi.org/10.1016/j.cose.2022.102686
2022-05-01
Abstract: Dynamic malware detection executes software in a secured virtual environment and monitors its run-time behavior. This technique commonly relies on API sequence analysis to decide whether the running software is malicious. However, existing solutions typically consider only the API name or the frequency of API usage and do not mine the API sequence thoroughly enough, which allows some malware to evade detection. In this paper, we propose a novel malware detection framework that uses deep learning models to capture and combine more meaningful features, which we call the intrinsic features of the API sequence. Specifically, we first apply embedding and convolutional layers to build a joint representation of multiple APIs that represents the software behavior. Second, we use the category, action, and operation object of each API to represent the semantic information of each API call. Finally, we use a Bi-LSTM module to mine the relationship information between APIs. Our proposed model achieves an accuracy of 0.9731 and an F1-score of 0.9724 on a large real-world dataset, significantly outperforming the baselines. We also conduct ablation studies to demonstrate the effectiveness of our intrinsic features.
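As an illustration of the second step, the sketch below splits an API call into a (category, action, operation object) triple. It is a minimal sketch under stated assumptions, not the authors' extraction procedure, which the abstract does not describe: the `ACTIONS` vocabulary, the CamelCase splitting rule, the `<unk>` and `<none>` placeholders, and the names `ApiFeatures`, `split_camel`, and `semantic_features` are all hypothetical. The category is assumed to come from the sandbox report alongside the API name.

```python
import re
from typing import List, NamedTuple

# Hypothetical action vocabulary; the abstract does not list the real one.
ACTIONS = ("Create", "Open", "Read", "Write", "Delete", "Query", "Set", "Close")


class ApiFeatures(NamedTuple):
    category: str  # e.g. "file", "registry", "network" (assumed sandbox label)
    action: str    # verb found in the API name, or "<unk>"
    obj: str       # what the verb operates on, or "<none>"


def split_camel(name: str) -> List[str]:
    """Split a CamelCase API name into tokens, e.g. NtCreateFile -> Nt, Create, File."""
    return re.findall(r"[A-Z][a-z0-9]*|[a-z0-9]+", name)


def semantic_features(api_name: str, category: str) -> ApiFeatures:
    """Return the (category, action, object) triple for one API call."""
    tokens = split_camel(api_name)
    for i, token in enumerate(tokens):
        if token in ACTIONS:
            # Everything after the first recognized verb is treated as the object.
            obj = "".join(tokens[i + 1:]) or "<none>"
            return ApiFeatures(category, token, obj)
    # No known verb: keep the whole name as the object.
    return ApiFeatures(category, "<unk>", api_name)


if __name__ == "__main__":
    calls = [("NtCreateFile", "file"), ("RegOpenKeyExW", "registry"), ("connect", "network")]
    for name, cat in calls:
        print(name, semantic_features(name, cat))
```

In a full pipeline, each element of the triple would presumably be mapped to an integer ID and embedded before being combined with the other features; that part is omitted here because the abstract gives no details.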
computer science, information systems