Abstract: Attack knowledge graph construction seeks to convert textual cyber threat intelligence (CTI) reports into structured representations, portraying the evolutionary traces of cyber attacks. Even though previous research has proposed various methods to construct attack knowledge graphs, these methods generally generalize poorly to diverse knowledge types and require expertise in model design and tuning. To address these limitations, we utilize Large Language Models (LLMs), which have achieved enormous success in a broad range of tasks thanks to their exceptional capabilities in both language understanding and zero-shot task fulfillment. We propose AttacKG+, a fully automatic LLM-based framework to construct attack knowledge graphs. Our framework consists of four consecutive modules: rewriter, parser, identifier, and summarizer, each of which is implemented by instruction prompting and in-context learning empowered by LLMs. Furthermore, we upgrade the existing attack knowledge schema and propose a comprehensive version. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation: behavior graph, MITRE TTP labels, and state summary. Extensive evaluation demonstrates that: 1) our formulation seamlessly satisfies the information needs in threat event analysis, 2) our construction framework is effective in faithfully and accurately extracting the information defined by AttacKG+, and 3) our attack graph directly benefits downstream security practices such as attack reconstruction. All the code and datasets will be released upon acceptance.
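
The three-layer, step-by-step representation described in the abstract can be pictured as a small data structure. The following is a minimal sketch, not the authors' schema: the class names (`BehaviorEdge`, `AttackStep`, `AttackEvent`), the triple-based encoding of the behavior graph, and the `techniques` helper are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BehaviorEdge:
    """One (subject, action, object) triple in a step's behavior graph.

    Hypothetical encoding; the abstract does not specify the graph format.
    """
    subject: str
    action: str
    obj: str


@dataclass
class AttackStep:
    """One temporal step holding the three layers named in the abstract."""
    behavior_graph: list[BehaviorEdge] = field(default_factory=list)
    ttp_labels: list[str] = field(default_factory=list)  # e.g. MITRE ATT&CK IDs
    state_summary: str = ""


@dataclass
class AttackEvent:
    """A cyber attack modeled as an ordered sequence of temporal steps."""
    steps: list[AttackStep] = field(default_factory=list)

    def techniques(self) -> list[str]:
        """Return TTP labels in order of first appearance, without repeats."""
        seen: dict[str, None] = {}
        for step in self.steps:
            for label in step.ttp_labels:
                seen.setdefault(label, None)
        return list(seen)
```

For example, an event whose first step is labeled `T1566` (Phishing) and whose second step is labeled `T1059` (Command and Scripting Interpreter) and `T1566` would yield `["T1566", "T1059"]` from `techniques()`.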

What problem does this paper attempt to address?

### The Problem Addressed by This Paper

This paper addresses the problem of constructing attack knowledge graphs from Cyber Threat Intelligence (CTI) reports. Specifically:

1. **Limitations of Existing Methods**:
   - **Limited Semantic Understanding**: Existing models struggle with diverse attack scenarios and knowledge types. They are typically small and trained on small datasets, which makes it difficult for them to cope with the variety of security knowledge found in open scenarios.
   - **Strong Dependency on Model Design**: Current methods require specially designed natural language processing or graph matching models, plus significant human effort for fine-tuning. This is a barrier for security practitioners who lack the relevant background.
2. **Leveraging the Advantages of Large Language Models (LLMs)**:
   - LLMs are pre-trained on large-scale open knowledge data, which gives them strong contextual understanding and knowledge reasoning capabilities across different domains and knowledge types.
   - LLMs can perform zero-shot and few-shot tasks through instruction following and in-context learning, without special model architectures or task-specific training data.

   Using LLMs to construct attack knowledge graphs can therefore address both limitations above.
3. **Proposed New Framework AttacKG+**:
   - The paper proposes a fully automated LLM-based framework called AttacKG+, which consists of four modules: Rewriter, Parser, Identifier, and Summarizer.
   - The framework converts CTI reports into structured attack knowledge graphs by rewriting, parsing, identifying, and summarizing in sequence, with each module implemented through LLM prompting.

With this approach, the paper aims to improve the accuracy and generalization of attack knowledge graph construction while simplifying model design, making the process more accessible to users.
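
The four-module flow (rewrite, parse, identify, summarize) can be sketched as a chain of prompt calls. This is a rough illustration under stated assumptions, not the paper's implementation: the `LLM` callable type, the `PROMPTS` templates, the `run_pipeline` function, and the choice to feed the rewritten report to all three later modules are hypothetical.

```python
from typing import Callable

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]

# Placeholder instructions; the paper's actual prompts are not reproduced here.
PROMPTS = {
    "rewriter": "Rewrite this CTI report as ordered attack steps:\n{text}",
    "parser": "Extract behavior triples from these steps:\n{text}",
    "identifier": "Assign MITRE TTP labels to these steps:\n{text}",
    "summarizer": "Summarize the system state after these steps:\n{text}",
}


def run_pipeline(report: str, llm: LLM) -> dict[str, str]:
    """Run the four modules in order and return each module's raw output."""
    rewritten = llm(PROMPTS["rewriter"].format(text=report))
    outputs = {"rewriter": rewritten}
    # Assumption: the three later modules all read the rewritten report.
    for name in ("parser", "identifier", "summarizer"):
        outputs[name] = llm(PROMPTS[name].format(text=rewritten))
    return outputs
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client; a stub function that returns canned strings is enough to exercise the control flow.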

AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models

AttacKG+: Boosting Attack Graph Construction with Large Language Models

LLM-TIKG: Threat intelligence knowledge graph construction utilizing large language model

MultiKG: Multi-Source Threat Intelligence Aggregation for High-Quality Knowledge Graph Representation of Attack Techniques

Using Retriever Augmented Large Language Models for Attack Graph Generation

AttacKG: Constructing Technique Knowledge Graph from Cyber Threat Intelligence Reports

KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment

Learning on Graphs with Large Language Models (LLMs): A Deep Dive into Model Robustness

Actionable Cyber Threat Intelligence using Knowledge Graphs and Large Language Models

APTKG: Constructing Threat Intelligence Knowledge Graph from Open-Source APT Reports Based on Deep Learning

CSKG4APT: A Cybersecurity Knowledge Graph for Advanced Persistent Threat Organization Attribution

K-CTIAA: Automatic Analysis of Cyber Threat Intelligence Based on a Knowledge Graph

AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks

Cyber Knowledge Completion Using Large Language Models

SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graphs

GraphAttacker: A General Multi-Task GraphAttack Framework

Target-driven Attack for Large Language Models

KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

Large Language Models Merging for Enhancing the Link Stealing Attack on Graph Neural Networks

Attack Prompt Generation for Red Teaming and Defending Large Language Models

On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions