Anomaly prediction of Internet behavior based on generative adversarial networks

XiuQing Wang,Yang An,Qianwei Hu
DOI: https://doi.org/10.7717/peerj-cs.2009
2024-07-23
Abstract:With the popularity of Internet applications, a large amount of Internet behavior log data is generated. Abnormal behaviors of corporate employees may lead to internet security issues and data leakage incidents. To ensure the safety of information systems, it is important to research on anomaly prediction of Internet behaviors. Due to the high cost of labeling big data manually, an unsupervised generative model-Anomaly Prediction of Internet behavior based on Generative Adversarial Networks (APIBGAN), which works only with a small amount of labeled data, is proposed to predict anomalies of Internet behaviors. After the input Internet behavior data is preprocessed by the proposed method, the data-generating generative adversarial network (DGGAN) in APIBGAN learns the distribution of real Internet behavior data by leveraging neural networks' powerful feature extraction from the data to generate Internet behavior data with random noise. The APIBGAN utilizes these labeled generated data as a benchmark to complete the distance-based anomaly prediction. Three categories of Internet behavior sampling data from corporate employees are employed to train APIBGAN: (1) Online behavior data of an individual in a department. (2) Online behavior data of multiple employees in the same department. (3) Online behavior data of multiple employees in different departments. The prediction scores of the three categories of Internet behavior data are 87.23%, 85.13%, and 83.47%, respectively, and are above the highest score of 81.35% which is obtained by the comparison method based on Isolation Forests in the CCF Big Data & Computing Intelligence Contest (CCF-BDCI). The experimental results validate that APIBGAN predicts the outlier of Internet behaviors effectively through the GAN, which is composed of a simple three-layer fully connected neural networks (FNNs). We can use APIBGAN not only for anomaly prediction of Internet behaviors but also for anomaly prediction in many other applications, which have big data infeasible to label manually. Above all, APIBGAN has broad application prospects for anomaly prediction, and our work also provides valuable input for anomaly prediction-based GAN.
What problem does this paper attempt to address?