Novel Security Metrics for Identifying Risky Unified Resource Locators (URLs)

Mahmood Deypir, Toktam Zoughi
DOI: https://doi.org/10.1007/s40998-023-00690-x
2024-05-12
Iranian Journal of Science and Technology Transactions of Electrical Engineering
Abstract: Attackers carry out malicious activities by sending URLs to victims via e-mail, SMS, social network messages, and other means. Recently, intruders have been generating malicious URLs algorithmically. They also use shortening or obfuscation services to bypass firewalls and other security barriers. Several machine learning methods have been proposed to distinguish malicious URLs from normal ones, all of which are subject to classification errors. On the other hand, it is impractical to maintain a complete and up-to-date blacklist because of the large number of malicious URLs generated daily. Therefore, calculating a URL's security risk would be more helpful than classifying the URL. In this way, a user can decide correctly whether to use an unfamiliar URL if they know its associated security risk. In this study, the problem of URL security risk computation is introduced and two effective novel criteria for this problem are proposed. Based on these criteria, a security risk score can be estimated for each incoming URL. In the first criterion, based on previous malicious and non-malicious URL instances, the extracted features of a URL are divided into two categories: those that increase the security risk and those that reduce it. In the second criterion, the security risk score of an unknown URL is estimated based on its distances to the nearest known malicious and safe URLs. For both criteria, corresponding formulations and algorithms are designed and described. Extensive empirical evaluations on various real datasets show the effectiveness of the proposed criteria in terms of malicious URL detection rate. Moreover, our experiments show that the proposed metrics significantly outperform previously proposed risk score criteria.
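The idea behind the second criterion can be illustrated with a minimal sketch. The abstract does not give the paper's formulation, so everything below is an assumption made for illustration: the toy feature set in `url_features`, the use of Euclidean distance, and the ratio `d_safe / (d_mal + d_safe)` as the score are hypothetical choices, not the authors' method.

```python
import math


def url_features(url):
    """Toy numeric features (hypothetical; not the paper's feature set)."""
    return [
        len(url),                         # overall length
        sum(ch.isdigit() for ch in url),  # number of digits
        url.count("."),                   # number of dots
        url.count("-"),                   # number of hyphens
        1.0 if "@" in url else 0.0,       # presence of '@'
    ]


def nearest_distance(vec, known_vectors):
    """Euclidean distance from vec to its closest neighbour in known_vectors."""
    return min(math.dist(vec, other) for other in known_vectors)


def distance_risk(url, malicious_urls, safe_urls):
    """Assumed score in [0, 1]; higher means closer to known malicious URLs."""
    vec = url_features(url)
    d_mal = nearest_distance(vec, [url_features(u) for u in malicious_urls])
    d_safe = nearest_distance(vec, [url_features(u) for u in safe_urls])
    if d_mal + d_safe == 0:
        return 0.5  # same features as both a malicious and a safe example
    return d_safe / (d_mal + d_safe)
```

With this assumed score, a URL whose features match a known malicious example receives a risk of 1, one that matches a known safe example receives 0, and anything else falls in between, which gives the user a graded risk rather than a hard class label.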
engineering, electrical & electronic