Multi-context Features for Detecting Malicious Programs

Moustafa Saleh, Tao Li, Shouhuai Xu
DOI: https://doi.org/10.1007/s11416-017-0304-8
2017-01-01
Journal of Computer Virology and Hacking Techniques
Abstract: Malware detection is still an open problem. Numerous attacks take place every day in which malware is used to steal private information, disrupt services, or sabotage industrial systems. In this paper, we combine three kinds of contextual information, namely static, dynamic, and instruction-based, for malware detection. This leads to the definition of more than thirty thousand features, a large feature set that covers a wide range of a sample's characteristics. Through experiments with one million files, we show that this feature set leads to machine-learning-based models that can detect both malware seen roughly at the time the models are built and malware first seen months after the models were built (i.e., the detection models remain effective months ahead of time). This may be due to the comprehensiveness of the feature set.