A Framework for Predicting Data Breach Risk: Leveraging Dependence to Cope with Sparsity

Zijian Fang, Maochao Xu, Shouhuai Xu, Taizhong Hu
DOI: https://doi.org/10.1109/tifs.2021.3051804
IF: 7.231
2021-01-01
IEEE Transactions on Information Forensics and Security
Abstract: Data breaches are a major cybersecurity problem that has caused huge financial losses and compromised many individuals' privacy (e.g., social security numbers). This calls for a deeper understanding of data breach risk. Despite the substantial attention directed toward the issue, many fundamental problems are yet to be investigated. In this article, we initiate the study of modeling and predicting risk in enterprise-level data breaches. This problem is challenging because the breaches experienced by an individual enterprise are sparse over time, which immediately disqualifies standard statistical models: there are not enough data to train them. As a first step towards tackling the problem, we propose an innovative statistical framework that leverages the dependence between multiple time series. To validate the framework, we apply it to a dataset of enterprise-level breach incidents. Experimental results show its effectiveness in modeling and predicting enterprise-level breach incidents.