TSMWD: A High-Speed Malicious Web Page Detection System Based on Two-Step Classifiers

Zhengqi Wang, Xiaobing Feng, Yukun Niu, Chi Zhang, Jue Su
DOI: https://doi.org/10.1109/nana.2017.58
2017-01-01
Abstract: With the rapid development of the Internet and the growth of online business, the number of websites is growing explosively, and malicious websites are becoming the biggest threat of the Internet age. As a result, there has been broad interest in developing systems that detect malicious web pages before end users visit them. As detection is made more accurate, the analysis process becomes more costly, sometimes requiring several seconds for a single page. Performing this analysis on a large set of web pages containing hundreds of millions of samples can therefore be infeasible. This paper proposes TSMWD, a two-step detection system based on machine learning. The first step of TSMWD provides fast and reliable large-scale detection over the vast number of unknown web pages: it quickly discards the great majority of benign pages using static analysis based on a modified Naïve Bayes classifier, and forwards the remaining unknown samples to the second step, which makes a more precise final detection. Because 99% of web pages on the Internet are benign, this greatly improves the system's detection speed.
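The two-step idea in the abstract can be sketched as a simple cascade. This is only an illustration under stated assumptions, not the authors' implementation: the token list, the scoring function, the threshold, and the second-step check are all hypothetical stand-ins (the paper's first step is a modified Naïve Bayes classifier over static features, which is not reproduced here). The sketch also shows why filtering most benign pages early lowers the average cost per page.

```python
from typing import Callable, Iterable, List

# Hypothetical static indicators; the paper's actual feature set is not given here.
SUSPICIOUS_TOKENS = ("eval(", "unescape(", "<iframe")


def toy_fast_score(page: str) -> int:
    """Cheap stand-in for step 1: count suspicious tokens in the raw page text."""
    return sum(page.count(tok) for tok in SUSPICIOUS_TOKENS)


def toy_slow_is_malicious(page: str) -> bool:
    """Stand-in for the costly step 2 (placeholder rule, for illustration only)."""
    return "eval(unescape(" in page


def two_step_detect(
    pages: Iterable[str],
    fast_score: Callable[[str], float],
    slow_is_malicious: Callable[[str], bool],
    threshold: float,
) -> List[str]:
    """Label each page; only pages scoring >= threshold reach the slow step."""
    verdicts = []
    for page in pages:
        if fast_score(page) < threshold:
            verdicts.append("benign")  # discarded by step 1, step 2 never runs
        elif slow_is_malicious(page):
            verdicts.append("malicious")
        else:
            verdicts.append("benign")
    return verdicts


def expected_cost_per_page(
    fast_cost: float, slow_cost: float, forwarded_fraction: float
) -> float:
    """Average cost when every page pays fast_cost and only some pay slow_cost."""
    return fast_cost + forwarded_fraction * slow_cost


pages = [
    "<p>hello</p>",
    "<iframe src='x'></iframe>",
    "<script>eval(unescape('%41'))</script>",
]
verdicts = two_step_detect(pages, toy_fast_score, toy_slow_is_malicious, threshold=1)
```

With assumed costs of 0.001 s for the first step and 2 s for the second, forwarding 2% of pages gives an average of 0.001 + 0.02 × 2 = 0.041 s per page, compared with 2 s if every page went through the costly analysis. These figures are illustrative and are not taken from the paper.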