Real-time Detection of Malicious Web Pages Based on Statistical Learning

WANG Tao, YU Shun-zheng
DOI: https://doi.org/10.3969/j.issn.1002-137x.2011.01.019
2011-01-01
Computer Science
Abstract: Malicious Web pages have posed an increasing threat to Web security in recent years. Currently, there are two main client-side protection approaches: anti-virus software packages and blacklists of malicious sites. Anti-virus techniques commonly use signature-based approaches, which may not efficiently identify malicious HTML code that is encrypted or obfuscated. Furthermore, blacklists are difficult to keep up to date. This paper presents a novel classification method for real-time detection of malicious Web pages that is independent of the content of the Web pages. Our approach characterizes malicious Web pages using HTTP session information. Using representative statistical features and a decision tree algorithm from machine learning, we built an effective classification model for online, real-time detection of malicious Web pages. Experimental results demonstrate that we are able to detect 89.7% of the malicious Web pages with a low false positive rate of 0.3%.
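
The two figures reported above, the detection rate and the false positive rate, can be illustrated with a minimal Python sketch. It assumes binary labels (1 = malicious page, 0 = benign page). The function name `detection_metrics` and the toy labels are hypothetical and are not taken from the paper; they only show how such rates are computed from a classifier's predictions.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_positive_rate) for binary labels.

    Assumed label convention (not from the paper):
    1 = malicious page, 0 = benign page.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed

    # Detection rate: share of malicious pages that were flagged.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: share of benign pages that were flagged.
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Hypothetical toy labels for illustration only, not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

dr, fpr = detection_metrics(y_true, y_pred)
print(f"detection rate = {dr:.3f}, false positive rate = {fpr:.3f}")
```

Under this convention, a detection rate of 89.7% means that roughly 897 of every 1,000 malicious pages are flagged, and a false positive rate of 0.3% means that roughly 3 of every 1,000 benign pages are flagged by mistake.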