Uncovering CWE-CVE-CPE Relations with Threat Knowledge Graphs

Zhenpeng Shi, Nikolay Matyunin, Kalman Graffi, David Starobinski
DOI: https://doi.org/10.1145/3641819
2023-05-01
Abstract: Security assessment relies on public information about products, vulnerabilities, and weaknesses. So far, databases in these categories have rarely been analyzed in combination. Yet, doing so could help predict unreported vulnerabilities and identify common threat patterns. In this paper, we propose a methodology for producing and optimizing a knowledge graph that aggregates knowledge from common threat databases (CVE, CWE, and CPE). We apply the threat knowledge graph to predict associations between threat databases, specifically between products, vulnerabilities, and weaknesses. We evaluate the prediction performance both in closed world with associations from the knowledge graph, and in open world with associations revealed afterward. Using rank-based metrics (i.e., Mean Rank, Mean Reciprocal Rank, and Hits@N scores), we demonstrate the ability of the threat knowledge graph to uncover many associations that are currently unknown but will be revealed in the future, which remains useful over different time periods. We propose approaches to optimize the knowledge graph, and show that they indeed help in further uncovering associations.
Cryptography and Security
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper addresses how to predict associations among products, vulnerabilities, and weaknesses using threat knowledge graphs. Specifically, it proposes a methodology for generating and optimizing a knowledge graph that integrates common threat databases (CVE, CWE, and CPE). The resulting graph can be used to predict unreported vulnerabilities and to identify common threat patterns.

### Background and motivation

Security assessment depends on public information about products, vulnerabilities, and weaknesses. So far, the databases covering these categories have rarely been analyzed in combination, even though doing so can help predict unreported vulnerabilities and identify common threat patterns. The three databases play complementary roles: the CVE database lists publicly known security vulnerabilities, the CWE database lists common software and hardware weakness types, and the CPE database provides a structured naming scheme for describing products and technical systems.

### Main contributions

1. **Construction and optimization of threat knowledge graphs**:
   - Convert the entries in threat databases and their associations into triples in the knowledge graph.
   - Optimize the knowledge graph to improve its predictive ability, including removing obsolete entries and introducing data from other sources (such as CAPEC entries and CVSS vectors).
2. **Knowledge graph embedding**:
   - Use knowledge graph embedding techniques to embed the knowledge graph into a vector space for link prediction.
   - Compare multiple embedding models (such as TransE, DistMult, and ComplEx) and show that the TransE model performs best on this task.
3. **Evaluation of predictive ability**:
   - Evaluate the predictive performance of the embedded knowledge graph in both closed-world and open-world settings, using standard rank-based metrics (Mean Rank, Mean Reciprocal Rank, and Hits@N scores).
   - Verify the predictive ability of the model on historical data and report its predictive performance at different points in time.
4. **Further optimization**:
   - Investigate several methods to further improve the predictive ability of the knowledge graph, including removing old CPE and CVE entries, adding CAPEC database entries, and adding CVSS vectors.

### Conclusion

By constructing and optimizing a threat knowledge graph, the paper demonstrates an effective method for predicting associations among products, vulnerabilities, and weaknesses. This helps identify and mitigate potential security threats in a timely manner and supports future security assessments.
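The pipeline summarized under "Main contributions" (triples, embedding-based scoring, rank-based metrics) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout, the relation names `has_weakness` and `affects`, the identifiers in `TOY_RECORDS`, and all function names are hypothetical, and the embeddings in the demo are random rather than trained. The scoring function is the standard TransE form (negative distance between `h + r` and `t`), and the metrics follow the usual definitions of Mean Rank, Mean Reciprocal Rank, and Hits@N.

```python
import math
import random

# Hypothetical toy records; identifiers and field names are made up for illustration.
TOY_RECORDS = [
    {"cve": "CVE-A", "cwes": ["CWE-X"], "cpes": ["product-1", "product-2"]},
    {"cve": "CVE-B", "cwes": ["CWE-X", "CWE-Y"], "cpes": ["product-2"]},
]


def build_triples(records):
    """Turn each record into (head, relation, tail) triples.

    The relation names are assumptions, not the paper's schema.
    """
    triples = []
    for rec in records:
        for cwe in rec["cwes"]:
            triples.append((rec["cve"], "has_weakness", cwe))
        for cpe in rec["cpes"]:
            triples.append((rec["cve"], "affects", cpe))
    return triples


def transe_score(h, r, t):
    """TransE plausibility score: negative L2 norm of h + r - t (higher is better)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_of_true_tail(head, rel, true_tail, entities, ent_vec, rel_vec):
    """1-based rank of the true tail among all candidate tail entities."""
    true_score = transe_score(ent_vec[head], rel_vec[rel], ent_vec[true_tail])
    better = sum(
        1
        for e in entities
        if transe_score(ent_vec[head], rel_vec[rel], ent_vec[e]) > true_score
    )
    return 1 + better


def rank_metrics(ranks, n=10):
    """Mean Rank, Mean Reciprocal Rank, and Hits@N over a list of 1-based ranks."""
    count = len(ranks)
    return {
        "MR": sum(ranks) / count,
        "MRR": sum(1.0 / r for r in ranks) / count,
        f"Hits@{n}": sum(1 for r in ranks if r <= n) / count,
    }


if __name__ == "__main__":
    triples = build_triples(TOY_RECORDS)
    entities = sorted({h for h, _, _ in triples} | {t for _, _, t in triples})
    relations = sorted({r for _, r, _ in triples})

    # Random, untrained embeddings: a real pipeline would learn these vectors.
    rng = random.Random(0)
    dim = 8
    ent_vec = {e: [rng.uniform(-1, 1) for _ in range(dim)] for e in entities}
    rel_vec = {r: [rng.uniform(-1, 1) for _ in range(dim)] for r in relations}

    ranks = [
        rank_of_true_tail(h, r, t, entities, ent_vec, rel_vec)
        for h, r, t in triples
    ]
    print(rank_metrics(ranks, n=3))
```

With trained embeddings, a lower Mean Rank and higher Mean Reciprocal Rank and Hits@N would indicate that true associations are ranked closer to the top of the candidate list. In an open-world evaluation, the ranked triples would be associations published after the graph was built rather than triples held out from it.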