Dynamic Malware Analysis Based on API Sequence Semantic Fusion

Sanfeng Zhang,Jiahao Wu,Mengzhe Zhang,Wang Yang
DOI: https://doi.org/10.3390/app13116526
2023-05-26
Applied Sciences
Abstract:The existing dynamic malware detection methods based on API call sequences ignore the semantic information of functions. Simply mapping API to numerical values does not reflect whether a function has performed a query or modification operation, whether it is related to network communication, the file system, or other factors. Additionally, the detection performance is limited when the size of the API call sequence is too large. To address this issue, we propose Mal-ASSF, a novel malware detection model that fuses the semantic and sequence features of the API calls. The API2Vec embedding method is used to obtain the dimensionality reduction representation of the API function. To capture the behavioral features of sequential segments, Balts is used to extract the features. To leverage the implicit semantic information of the API functions, the operation and the type of resource operated by the API functions are extracted. These semantic and sequential features are then fused and processed by the attention-related modules. In comparison with the existing methods, Mal-ASSF boasts superior capabilities in terms of semantic representation and recognition of critical sequences within API call sequences. According to the evaluation with a dataset of malware families, the experimental results show that Mal-ASSF outperforms existing solutions by 3% to 5% in detection accuracy.
materials science, multidisciplinary,engineering,chemistry,physics, applied
What problem does this paper attempt to address?