IGXSS: XSS payload detection model based on inductive GCN
Qiuhua Wang, Chuangchuang Li, Dong Wang, Lifeng Yuan, Gaoning Pan, Yanyu Cheng, Mingde Hu, Yizhi Ren
DOI: https://doi.org/10.1002/nem.2264
2024-02-13
International Journal of Network Management
Abstract: We propose IGXSS, an XSS payload detection model based on an inductive graph neural network, for detecting XSS attacks targeting IoT devices. By transforming the XSS payload detection task into a node classification task, we leverage the strength of inductive graph neural networks in learning sample features, enabling IGXSS to achieve an F1 score of 0.846 despite an unbalanced sample distribution and without relying on external resources. To facilitate management, Internet of Things (IoT) vendors usually manage IoT devices uniformly through remote channels such as HTTP services. As a result, traditional web application vulnerabilities such as cross‐site scripting (XSS), code injection, and remote command/code execution (RCE) also endanger IoT cloud interfaces. XSS is one of the most common web application attacks; it allows an attacker to obtain private user information or to attack IoT devices and IoT cloud platforms. Most existing XSS payload detection models are based on machine learning or deep learning and usually require substantial external resources, such as pretrained word vectors, to perform well on unknown samples. In the field of XSS payload detection, however, high‐quality vector representations of samples are often difficult to obtain. In addition, existing models all perform substantially worse when the distribution of XSS payloads and benign samples in the test dataset is extremely unbalanced (e.g., XSS payloads : benign samples = 1:20). In a real XSS attack scenario against IoT, an XSS payload is often hidden in a massive volume of normal user requests, so these models are not practical. In response to these issues, we propose IGXSS (XSS payload detection model based on inductive GCN), a model based on inductive graph neural networks that detects XSS payloads targeting IoT.
First, we treat the samples and the words obtained by segmenting them as nodes and connect them with edges to form a graph. Then, we obtain the feature matrix of nodes and edges using only the information between nodes (instead of external resources such as pretrained word vectors). Finally, we feed the resulting feature matrix into a two‐layer GCN for training and evaluate the model on several datasets with different sample distributions. Extensive experiments on real datasets show that IGXSS outperforms other models under various sample distributions. In particular, when the sample distribution is extremely unbalanced, the recall and F1 score of IGXSS still reach 1.000 and 0.846, respectively, demonstrating that IGXSS is more robust and better suited to practical scenarios.
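The three steps described in the abstract (build a sample-word graph, derive features without external resources, run a two-layer GCN) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the tokenizer, the smoothed TF-IDF edge weights, the one-hot placeholder features, the hidden size of 8, and the random untrained weights are all assumptions made for illustration, and word-word edges are omitted.

```python
import math
import random
import re
from collections import Counter


def tokenize(sample):
    # Hypothetical tokenizer: alphanumeric runs, or single punctuation characters.
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", sample.lower())


def build_graph(samples):
    """Build a symmetric sample-word graph; sample nodes come first."""
    tokenized = [tokenize(s) for s in samples]
    vocab = sorted({w for toks in tokenized for w in toks})
    word_id = {w: len(samples) + k for k, w in enumerate(vocab)}
    n = len(samples) + len(vocab)
    adj = [[0.0] * n for _ in range(n)]
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    for i, toks in enumerate(tokenized):
        for w, tf in Counter(toks).items():
            # Assumed edge weight: smoothed TF-IDF, always positive.
            weight = (tf / len(toks)) * (math.log(len(samples) / doc_freq[w]) + 1.0)
            adj[i][word_id[w]] = adj[word_id[w]][i] = weight
    for k in range(n):
        adj[k][k] = 1.0  # self-loops, so every node has a nonzero degree
    return vocab, adj


def normalize(adj):
    # Symmetric normalization: D^(-1/2) A D^(-1/2).
    deg = [sum(row) for row in adj]
    return [[a / math.sqrt(deg[i] * deg[j]) for j, a in enumerate(row)]
            for i, row in enumerate(adj)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_forward(a_hat, x, w1, w2):
    """Two-layer GCN: softmax(A_hat * ReLU(A_hat * X * W1) * W2)."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(a_hat, matmul(x, w1))]
    logits = matmul(a_hat, matmul(hidden, w2))
    probs = []
    for row in logits:
        m = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs


samples = [
    "<script>alert(1)</script>",
    "search?q=weather today",
    "<img src=x onerror=alert(1)>",
]
vocab, adj = build_graph(samples)
a_hat = normalize(adj)
n = len(adj)
# Placeholder features (assumption): one-hot identity, no pretrained vectors.
x = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
rng = random.Random(0)
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(8)] for _ in range(n)]
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(8)]
probs = gcn_forward(a_hat, x, w1, w2)
sample_probs = probs[:len(samples)]  # class probabilities for the sample nodes
```

Because the weights here are untrained, `sample_probs` carries no predictive meaning; in a trained model the loss would be computed only on labeled sample nodes, which is what makes payload detection a node classification task.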
computer science, information systems,telecommunications