MalAnalyser: An Effective and Efficient Windows Malware Detection Method Based on API Call Sequences

Prachi Ahlawat, Namita Dabas, Prabha Sharma
DOI: https://doi.org/10.2139/ssrn.4295237
2022-01-01
SSRN Electronic Journal
Abstract: Malware is a serious cyber security threat that is extremely difficult to combat. Over the past few years, the frequency of malware attacks has increased manifold and is expected to grow further in the coming years. Many studies suggest that the API call sequences generated by a process can serve as an effective tool for malware detection. However, most existing works based on API call sequences suffer from high-dimensional feature sets. This paper proposes a novel API call sequence based malware detection model, MalAnalyser, to address this issue. To reduce the computational overhead associated with the extensive original call sequences, MalAnalyser extracts frequent API call subsequences (patterns) as features for differentiating malware from benign processes, as they capture the most representative and valuable behavioural information of a process. The model then uses Global Local Best Particle Swarm Optimization (GLBPSO) to identify the discriminatory features, with the objective of reducing the computational overhead of the malware detection algorithm without jeopardizing detection accuracy. Extensive experiments on benign and malware samples show that the proposed model delivers better performance than current state-of-the-art models. Experiments on the complete feature set and on a randomly selected 60% subset of the features both achieve a detection accuracy of up to 99.71%, with reductions in feature set size of up to 50.52% and 74.46%, respectively. Another major contribution of this work is the detection of rare or unseen malware behaviour by generating new malware patterns from the existing patterns using a Genetic Algorithm. MalAnalyser reported a significant performance gain with this enriched feature set and attained up to 100% accuracy with a feature set reduced by up to 70%. Further, the effectiveness of MalAnalyser is evaluated for detecting a specific type of malware, namely ransomware.
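To make the idea of frequent API call subsequences as features more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how patterns are mined, so this example assumes contiguous n-grams and a simple support threshold; the function names (`ngrams`, `frequent_patterns`, `to_feature_vector`), the parameters `n` and `min_support`, and the toy traces are all hypothetical.

```python
from collections import Counter


def ngrams(trace, n):
    """Return the set of contiguous length-n API call subsequences in one trace."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def frequent_patterns(traces, n=2, min_support=0.5):
    """Keep n-grams that occur in at least a min_support fraction of the traces.

    Assumption: 'frequent' means per-trace support; the paper may define it differently.
    """
    counts = Counter()
    for trace in traces:
        counts.update(ngrams(trace, n))  # each trace counts a pattern at most once
    threshold = min_support * len(traces)
    return sorted(p for p, c in counts.items() if c >= threshold)


def to_feature_vector(trace, patterns, n=2):
    """Encode a trace as a binary vector: 1 if the pattern occurs in it, else 0."""
    present = ngrams(trace, n)
    return [1 if p in present else 0 for p in patterns]


# Toy traces (hypothetical); real traces would come from sandbox or API hooking logs.
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "WriteFile", "RegSetValueExW"],
    ["RegOpenKeyExW", "RegSetValueExW", "CloseHandle"],
]
patterns = frequent_patterns(traces, n=2, min_support=0.5)
vectors = [to_feature_vector(t, patterns) for t in traces]
```

The resulting binary vectors are the kind of input on which a feature selection step such as the GLBPSO stage described in the abstract could operate, by searching for a subset of pattern columns that preserves classification accuracy.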