Learning URL Embedding for Malicious Website Detection

Xiaodan Yan, Yang Xu, Baojiang Cui, Shuhan Zhang, Taibiao Guo, Chaoliang Li
DOI: https://doi.org/10.1109/tii.2020.2977886
IF: 12.3
2020-01-01
IEEE Transactions on Industrial Informatics
Abstract: The emergence of artificial intelligence technology has promoted the development of the Internet of Things. However, this promising cyber technology can encounter serious security problems while accessing the Internet. A malicious website can disguise itself as a normal website and obtain users’ private information. Thus, it is very important to detect malicious websites using tools such as machine learning (ML) algorithms, as these algorithms can help us to identify abnormal information hidden in the mass traffic more easily. Accordingly, many feature engineering tasks must be performed from memory, as a strong ML model is greatly improved with good features. In this article, we propose an unsupervised learning algorithm that learns URL embedding. We also explore some key parameters regarding a domain embedding model to obtain a good effect on domain features.