MetaCluster: a Universal Interpretable Classification Framework for Cybersecurity

Wenhan Ge, Zeyuan Cui, Junfeng Wang, Binhui Tang, Xiaohui Li
DOI: https://doi.org/10.1109/tifs.2024.3372808
IF: 7.231
2024-01-01
IEEE Transactions on Information Forensics and Security
Abstract: Rising cyber threats have created an immediate demand for Deep Learning (DL) in cybersecurity. Nevertheless, the opaque nature of DL models poses challenges in deploying, collaborating, and assessing their effectiveness in less reliable cybersecurity environments. Although eXplainable Artificial Intelligence (XAI) plays a role in enhancing cybersecurity analytics, its limited task scope, propensity for data overfitting, and stochastic explanations hinder its broader application. To fill the gap, this paper introduces a generic interpretable classification framework, named MetaCluster. MetaCluster generates semantic prototypes for features, patterns, and domains at varying granular levels by following three fundamental steps: embedding representations, acquiring prototypes, and aggregating semantics. These mechanisms guarantee that MetaCluster achieves critical information extraction and reliable classification at minimal cost. The experiments cover three cybersecurity classification tasks (malware family classification, threat behavior analysis, and malicious traffic identification) and assess the interpretability of the framework. In particular, when compared to other DL models, MetaCluster exhibits a significant reduction in parameter consumption of 79.52% to 91.78% and boosts operational speed by up to 71.37%, while its F1 scores remain stable or slightly increase. Additionally, MetaCluster can assess and visually represent the significance of image, text, and statistical features. This capability leads to a reduction of Mean Squared Error (MSE) between expected and actual predictions by 0.0101 to 0.1020.
What problem does this paper attempt to address?