Detecting Malicious Websites in Depth through Analyzing Topics and Web-pages
Senhao Wen, Zhiyuan Zhao, Hanbing Yan
DOI: https://doi.org/10.1145/3199478.3199500
2018-03-16
Abstract: We increasingly rely on HTML (Hypertext Markup Language) web-pages for online shopping, obtaining information, and handling official business, whether on a mobile phone or on a computer. The underground industry and deliberate attackers have targeted this reliance. They forge misleading URLs (Uniform Resource Locators) and web-pages using tricks that are difficult even for professionals to identify, in order to steal money, manipulate our devices, and monitor our lives. Currently, most malicious-website detection technology is based on features of URLs or web-page elements. This approach is unsatisfactory because it is difficult to extract all the relevant features, and its lag makes it hard to keep up with the rapid changes of malicious websites, especially phishing websites. This paper designs a thorough associated-analysis model of web-pages that uses topic tracking, abnormal topic discovery, web-page visual similarity assessment, web-page structure analysis, URL analysis, Internet resource analysis, and related techniques to address the problem of missed detections of malicious websites, especially those with unknown features. The detection model is able to discover forged payment web-pages, fake web-pages that damage the reputation of the relevant organization, personal-attack web-pages, tampered web-pages, web-page trojans, and so on. It can also track the topics of the underground industry and sense topic evolution. According to our experiments, it is effective and has high accuracy.