Malware Detection Using Contrastive Learning Based on Multi-Feature Fusion

Kailu Guo, Yang Xin, Tianxiang Yu
DOI: https://doi.org/10.1109/trustcom60117.2023.00229
2024-01-01
Abstract: The continuous emergence of malicious software poses a serious threat to computer security. Traditional malware detection approaches rely on a single or limited set of data sources for feature extraction, which may not fully capture the useful information available across the different dimensions of PE files. Furthermore, attackers often employ code obfuscation and evasion techniques to interfere with detection while preserving malicious behavior. To address these issues, we designed and implemented a model called MFFCL. First, our approach uses feature fusion to mine as much useful information as possible from execution records, forming the original dataset. Second, to resist code obfuscation, we use supervised contrastive learning to construct software feature representations in a high-dimensional space. Experiments on public datasets demonstrate that MFFCL detects malware with high accuracy and stability. Specifically, during training, MFFCL achieved a precision of 97.25% and a recall of 94.1%.
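The abstract does not spell out the loss MFFCL uses, so the following is only a minimal sketch of the general supervised contrastive objective (in the style of Khosla et al.), not the authors' implementation. It illustrates the idea that embeddings of samples sharing a label (for example, malicious or benign) are pulled together while embeddings with different labels are pushed apart. The function names, the `temperature` value, and the toy embeddings are all assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average supervised contrastive loss over anchors that have a positive.

    embeddings: list of equal-length, non-zero float vectors (hypothetical fused features)
    labels:     list of class labels, one per embedding
    temperature: assumed scaling hyperparameter, not taken from the paper
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Scaled similarity between anchor i and every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Positives are the other samples that share the anchor's label.
        positives = [a for a in sims if labels[a] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        # Numerically stable log of the softmax denominator.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy 2-D embeddings: the first two point one way, the last two another.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_aligned = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_misaligned = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setup the loss is lower when the labels agree with the geometric clusters than when they cut across them, which is the property a contrastive encoder is trained to exploit. A real model would compute this over mini-batches of learned embeddings with an automatic-differentiation framework.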