MSDetector: A Static PHP Webshell Detection System Based on Deep-Learning

Baijun Cheng, Yanhui Guo, Yan Ren, Gang Yang, Guosheng Xu
DOI: https://doi.org/10.1007/978-3-031-10363-6_11
2022-01-01
Abstract: A webshell is a web script containing malicious code fragments that attackers can use to launch web attacks. Identifying whether a web script contains malicious code fragments is therefore of great significance to web security. However, the flexibility of scripting languages such as PHP gives attackers opportunities to obfuscate scripts, making it challenging for traditional rule-based webshell detectors to detect malicious code fragments. Deep learning brings new ideas to webshell detection and improves detector performance, but the effectiveness of deep learning-based detectors depends on feature engineering and model design. The feature representations and models adopted by existing methods fail to mine the syntactic and semantic features of webshell scripts. To tackle these problems, we design a new code representation, called script sequence, according to the characteristics of webshells, and we introduce a new pretraining task to enhance the deep learning model's understanding of the syntax information in webshell code. This leads to the design and implementation of Malicious Script Detector (MSDetector). To evaluate MSDetector, we present a new PHP webshell dataset. Experimental results show that MSDetector achieves a higher F1 score and accuracy than other approaches on this dataset.