Abstract: Security assessment relies on public information about products, vulnerabilities, and weaknesses. So far, databases in these categories have rarely been analyzed in combination. Yet, doing so could help predict unreported vulnerabilities and identify common threat patterns. In this article, we propose a methodology for producing and optimizing a knowledge graph that aggregates knowledge from common threat databases (CVE, CWE, and CPE). We apply the threat knowledge graph to predict associations between threat databases, specifically between products, vulnerabilities, and weaknesses. We evaluate the prediction performance both in closed world with associations from the knowledge graph and in open world with associations revealed afterward. Using rank-based metrics (i.e., Mean Rank, Mean Reciprocal Rank, and Hits@N scores), we demonstrate the ability of the threat knowledge graph to uncover many associations that are currently unknown but will be revealed in the future, which remains useful over different time periods. We propose approaches to optimize the knowledge graph and show that they indeed help in further uncovering associations. We have made the artifacts of our work publicly available.
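
The rank-based metrics named in the abstract can be illustrated with a minimal sketch. It assumes each test association has already been assigned a 1-based rank among all candidates; the function name `rank_metrics`, the cutoffs `(1, 3, 10)`, and the example ranks are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Sequence


def rank_metrics(ranks: List[int], ns: Sequence[int] = (1, 3, 10)) -> Dict[str, float]:
    """Compute Mean Rank, Mean Reciprocal Rank, and Hits@N from 1-based ranks."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    total = len(ranks)
    metrics = {
        # Mean Rank: average position of the true entity (lower is better).
        "MR": sum(ranks) / total,
        # Mean Reciprocal Rank: average of 1/rank (higher is better).
        "MRR": sum(1.0 / r for r in ranks) / total,
    }
    for n in ns:
        # Hits@N: fraction of test cases whose true entity is in the top N.
        metrics[f"Hits@{n}"] = sum(1 for r in ranks if r <= n) / total
    return metrics


# Made-up ranks for four test associations (illustration only).
example_ranks = [1, 2, 5, 20]
metrics = rank_metrics(example_ranks)
print(metrics)
```

Mean Rank is sensitive to a few very poorly ranked associations, whereas MRR and Hits@N emphasize how often the true entity lands near the top, which is likely why the three are reported together.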

What problem does this paper attempt to address?

The paper addresses the following problem: existing threat databases (such as CVE, CWE, and CPE) are rarely analyzed in combination, which makes it difficult to predict unreported vulnerabilities and identify common threat patterns. Specifically, the authors point out that although these databases provide public information about products, vulnerabilities, and weaknesses, this information is usually isolated and rarely analyzed comprehensively. They therefore propose a method based on a threat knowledge graph that integrates and optimizes knowledge from different threat databases to predict hidden associations, especially those among products, vulnerabilities, and weaknesses.

### Specific description of the problem

1. **Limitations of existing threat databases**:
   - Existing threat databases (such as CVE, CWE, and CPE) provide a large amount of public information, but this information is often isolated and lacks effective comprehensive analysis.
   - The associations between these databases are usually analyzed manually, which is time-consuming and prone to missing important associations.
2. **The need to predict unreported vulnerabilities and identify common threat patterns**:
   - Predicting unreported vulnerabilities helps security assessors discover potential security risks in advance and take preventive measures.
   - Identifying common threat patterns helps to better understand the nature of security threats and to formulate more effective protection strategies.

### Solutions proposed in the paper

To address the above problems, the authors propose the following solutions:

1. **Constructing a threat knowledge graph**:
   - Convert the entries in the CVE, CWE, and CPE databases and their associations into triples (entity-relationship-entity) in the knowledge graph to form a unified knowledge representation.
   - Map these triples into a vector space through knowledge graph embedding techniques for link prediction.
2. **Optimizing the knowledge graph**:
   - Reduce redundant information by merging CPE entries that share all properties except the version number.
   - Remove CPE and CVE entries that have no associations to improve the quality of the knowledge graph.
3. **Predicting hidden associations**:
   - Train embedding models (such as TransE, DistMult, and ComplEx) on the knowledge graph and predict associations that are currently unknown but may be revealed in the future.
   - Verify the prediction performance of the model through closed-world and open-world evaluation experiments.

### Main contributions

1. **Proposing and implementing the concept of the threat knowledge graph**:
   - Integrate the entries in different threat databases and their associations into a unified knowledge graph.
   - Optimize the knowledge graph to improve its prediction ability.
2. **Applying knowledge graph embedding techniques for link prediction**:
   - Compare multiple embedding models (such as TransE, DistMult, and ComplEx) and show the superior performance of the TransE model on the task.
3. **Evaluating the prediction ability of the threat knowledge graph in different scenarios**:
   - Conduct extensive evaluations in closed-world and open-world settings to verify the effectiveness of the model.
   - Show the prediction ability of the model over different time periods.
4. **Exploring methods to further optimize the knowledge graph**:
   - Further improve the prediction ability by removing obsolete entries and introducing data from other sources (such as CAPEC entries and CVSS vectors).

In conclusion, this paper provides a method to predict unreported vulnerabilities and identify common threat patterns by constructing and optimizing a threat knowledge graph, thereby enhancing the effectiveness of security assessment.
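
The triple construction, CPE version merging, and TransE-based link prediction described in this summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the relation names `has_weakness` and `affects`, all identifiers such as `CVE-EXAMPLE-1`, and the hand-picked two-dimensional embeddings are hypothetical, and a real system would learn the embeddings during training. The version merge assumes CPE 2.3 formatted strings without escaped colons.

```python
import math
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]


def merge_cpe_version(cpe: str) -> str:
    """Wildcard the version field so CPEs differing only in version collapse."""
    fields = cpe.split(":")  # assumption: no escaped colons inside fields
    fields[5] = "*"          # cpe:2.3:part:vendor:product:VERSION:...
    return ":".join(fields)


def build_triples(cve_to_cwe: Dict[str, List[str]],
                  cve_to_cpe: Dict[str, List[str]]) -> List[Triple]:
    """Turn CVE-CWE and CVE-CPE associations into (head, relation, tail) triples."""
    triples = set()
    for cve, cwes in cve_to_cwe.items():
        for cwe in cwes:
            triples.add((cve, "has_weakness", cwe))  # hypothetical relation name
    for cve, cpes in cve_to_cpe.items():
        for cpe in cpes:
            triples.add((cve, "affects", merge_cpe_version(cpe)))  # hypothetical
    return sorted(triples)


def transe_score(h: Sequence[float], r: Sequence[float], t: Sequence[float]) -> float:
    """TransE distance ||h + r - t||_2; a lower value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_of_true_tail(head: str, relation: str, true_tail: str,
                      entity_emb: Dict[str, List[float]],
                      relation_emb: Dict[str, List[float]]) -> int:
    """Score every entity as a candidate tail and return the true tail's 1-based rank."""
    scores = {
        candidate: transe_score(entity_emb[head], relation_emb[relation], vec)
        for candidate, vec in entity_emb.items()
    }
    ordered = sorted(scores, key=scores.get)
    return ordered.index(true_tail) + 1


# Made-up records: two CPEs that differ only in version merge into one node.
cpe_v10 = "cpe:2.3:a:examplevendor:exampleproduct:1.0:*:*:*:*:*:*:*"
cpe_v11 = "cpe:2.3:a:examplevendor:exampleproduct:1.1:*:*:*:*:*:*:*"
triples = build_triples(
    {"CVE-EXAMPLE-1": ["CWE-EXAMPLE-A"]},
    {"CVE-EXAMPLE-1": [cpe_v10, cpe_v11]},
)

# Hand-picked toy embeddings standing in for trained ones.
entity_emb = {
    "CVE-EXAMPLE-1": [0.0, 0.0],
    "CWE-EXAMPLE-A": [1.0, 0.0],
    "CWE-EXAMPLE-B": [0.0, 3.0],
}
relation_emb = {"has_weakness": [1.0, 0.0]}
rank = rank_of_true_tail("CVE-EXAMPLE-1", "has_weakness", "CWE-EXAMPLE-A",
                         entity_emb, relation_emb)
print(triples, rank)
```

The ranks produced this way are the inputs to rank-based metrics such as Mean Rank, Mean Reciprocal Rank, and Hits@N.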

Uncovering CWE-CVE-CPE Relations with Threat Knowledge Graphs

Fine-grained Commit-level Vulnerability Type Prediction by CWE Tree Structure

Unveiling Hidden Links Between Unseen Security Entities

Using Program Knowledge Graph to Uncover Software Vulnerabilities

Recent Progress of Using Knowledge Graph for Cybersecurity

Automated CVE Analysis for Threat Prioritization and Impact Prediction

A Relevance Model for Threat-Centric Ranking of Cybersecurity Vulnerabilities

Linking Threat Tactics, Techniques, and Patterns with Defensive Weaknesses, Vulnerabilities and Affected Platform Configurations for Cyber Hunting

Knowledge graph reasoning for cyber attack detection

Cybersecurity Threat Hunting and Vulnerability Analysis Using a Neo4j Graph Database of Open Source Intelligence

SecTKG: A Knowledge Graph for Open-Source Security Tools

CSKG4APT: A Cybersecurity Knowledge Graph for Advanced Persistent Threat Organization Attribution

Predicting Entity Relations across Different Security Databases by Using Graph Attention Network

ThreatZoom: CVE2CWE using Hierarchical Neural Network

K-CTIAA: Automatic Analysis of Cyber Threat Intelligence Based on a Knowledge Graph

Linking Common Vulnerabilities and Exposures to the MITRE ATT&CK Framework: A Self-Distillation Approach

A System for Automated Open-Source Threat Intelligence Gathering and Management

Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database

V2W-BERT: A Framework for Effective Hierarchical Multiclass Classification of Software Vulnerabilities

A management knowledge graph approach for critical infrastructure protection: Ontology design, information extraction and relation prediction