Exploiting Feature Interactions for Malicious Website Detection with Overhead-accuracy Tradeoff

Shuaiqi Shen, Chong Yu, Kuan Zhang, Song Ci
DOI: https://doi.org/10.1109/icc42927.2021.9500731
2021-01-01
Abstract: Malicious websites attempt to install malware on users’ devices without permission, which can disrupt device operation, steal personal information, and even gain access to the device for future attacks. Accurate detection of malicious website behaviors is crucial for network security but still faces challenges. First, website features of various types and semantics are required to identify the wide range of malicious characteristics, leading to massive training data and computational overhead. Second, reducing model dimensionality requires a proper selection of website features, which is difficult because features are related in complex ways and can affect each other’s contribution to detection outcomes. In this paper, we propose a lightweight feature-based detection scheme against malicious websites that considers interaction measures among features and the overhead-accuracy tradeoff. Specifically, we systematically characterize the interactions among website features in a non-additive manner to indicate the aggregated impacts of feature subsets. We then propose a quantification method that measures feature interactions based on multivariate regression. With this method, important features are selected to substantially reduce the model dimension and computational complexity while maintaining desirable accuracy. The proposed scheme also provides an interpretable model that preserves the physical meanings of the original features. It allows users to balance the overhead-accuracy tradeoff in detection model training through feature subset selection, so that the model fits the requirements and constraints of real applications.
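The abstract does not give the quantification method in detail, so the following is only a minimal illustrative sketch of one common way to measure a non-additive pairwise interaction with regression, not the authors' formulation. It fits a least-squares model with two main effects and one product term, and takes the magnitude of the product-term coefficient as the interaction score. The function names `pairwise_interaction` and `interaction_matrix`, and the synthetic data, are assumptions introduced here for illustration.

```python
import numpy as np


def pairwise_interaction(X, y, i, j):
    """Coefficient of the product term x_i * x_j in a least-squares fit.

    Hypothetical helper: a stand-in for a regression-based interaction
    measure, not the method defined in the paper.
    """
    xi, xj = X[:, i], X[:, j]
    # Design matrix: intercept, two main effects, one product (interaction) term.
    A = np.column_stack([np.ones(len(y)), xi, xj, xi * xj])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[3]


def interaction_matrix(X, y):
    """Symmetric matrix of absolute pairwise interaction scores."""
    n = X.shape[1]
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            M[i, j] = M[j, i] = abs(pairwise_interaction(X, y, i, j))
    return M


# Synthetic data (assumed): features 1 and 2 interact, feature 0 is purely additive.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] + 2.0 * X[:, 1] * X[:, 2] + 0.01 * rng.normal(size=500)

M = interaction_matrix(X, y)
```

In this sketch, a large entry `M[i, j]` suggests that features `i` and `j` contribute jointly in a way that their individual effects do not capture. Such scores could then inform which feature subset to keep when trading training overhead against accuracy.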