Cybersecurity Defenses: Exploration of CVE Types through Attack Descriptions

Refat Othman, Bruno Rossi, Barbara Russo
2024-07-11
Abstract: Vulnerabilities in software security can remain undiscovered even after being exploited. Linking attacks to vulnerabilities helps experts identify and respond promptly to the incident. This paper introduces VULDAT, a classification tool using a sentence transformer MPNET to identify system vulnerabilities from attack descriptions. Our model was applied to 100 attack techniques from the ATT&CK repository and 685 issues from the CVE repository. Then, we compare the performance of VULDAT against the other eight state-of-the-art classifiers based on sentence transformers. Our findings indicate that our model achieves the best performance with an F1 score of 0.85, Precision of 0.86, and Recall of 0.83. Furthermore, we found 56% of CVE reports vulnerabilities associated with an attack were identified by VULDAT, and 61% of identified vulnerabilities were in the CVE repository.
Cryptography and Security, Software Engineering
What problem does this paper attempt to address?
The paper addresses the problem of automatically linking attack techniques to known software vulnerabilities recorded as Common Vulnerabilities and Exposures (CVE). The authors developed a tool, VULDAT, that analyzes the text of attack descriptions to identify and link the relevant CVE reports. The problem matters for three reasons:

1. **Improving response speed**: Automatically correlating attack techniques with CVE reports lets security experts identify and respond to security incidents more quickly.
2. **Enhancing efficiency**: Manually correlating a large number of attack techniques and CVE reports is cumbersome and error-prone; an automated tool reduces that effort.
3. **Strengthening security**: Accurate correlation helps organizations better understand potential security threats and take effective defensive measures.

### Research Background

As cyber-attacks grow in frequency and complexity, Cyber Threat Intelligence (CTI) becomes increasingly important. The MITRE Corporation maintains several repositories, such as ATT&CK, CAPEC, CWE, and CVE, that record and classify attack patterns and vulnerabilities. However, the information in these repositories is usually isolated and must be correlated manually, which is both time-consuming and error-prone.

### Main Research Questions

The paper answers two research questions (RQs):

- **RQ1**: How do sentence transformer models perform in detecting software vulnerabilities from attack texts?
- **RQ2**: How many CVE issues can VULDAT correctly detect?

### Method Overview

To answer these questions, the authors designed VULDAT, a tool based on a sentence transformer. The steps are:

1. **Data collection**: Collect attack descriptions and vulnerability reports from MITRE's ATT&CK, CAPEC, CWE, and CVE repositories.
2. **Pre-processing**: Clean and standardize the attack description texts, including removing URLs, references, and stop words.
3. **Design VULDAT architecture**: Use a pre-trained sentence transformer model (such as MPNet) to generate embedding vectors for attack descriptions and CVE reports.
4. **Calculate similarity scores**: Determine the most relevant CVE reports by calculating the cosine similarity between the embedding vectors.
5. **Performance evaluation**: Evaluate the performance of VULDAT using precision, recall, and F1 score.

### Main Contributions

- Developed the VULDAT tool, providing an automated method for detecting software vulnerabilities.
- Created a new annotated mapping data set that explicitly links ATT&CK techniques to vulnerabilities in the MITRE repositories.
- Conducted a comparative analysis of multiple sentence transformer models, showing how their performance differs under different pre-processing conditions.

### Results

According to the experimental results, VULDAT performs well under both partial and full pre-processing. With the multi-qa-mpnet-base-dot-v1 model in particular, it reaches an F1 score of 0.85, a precision of 0.86, and a recall of 0.83. In addition, VULDAT identified 56% of the CVE-reported vulnerabilities associated with an attack, about 61% of the vulnerabilities it identified were in the CVE repository, and the average Jaccard similarity is 0.40.

### Future Work

The authors plan to explore other models (such as sequence-to-sequence models) to improve CVE detection, to conduct large-scale manual inspections to verify VULDAT's output, and to recommend missing links between attacks and vulnerabilities to the MITRE committee.
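
### Illustrative Sketch

The pre-processing, similarity, and evaluation steps from the Method Overview can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `embed` function is a stand-in that uses sparse term counts where VULDAT uses sentence transformer embeddings, and the stop-word list, the citation-marker pattern, the `threshold` value, and the `CVE-EXAMPLE-*` identifiers are all assumptions made up for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption, not the list used in the paper).
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "and", "or", "is", "are",
              "by", "for", "on", "with", "that", "may", "can"}


def preprocess(text):
    """Lowercase, drop URLs and citation-style markers, remove stop words."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"\(citation:[^)]*\)|\[\d+\]", " ", text)  # assumed marker formats
    tokens = re.findall(r"[a-z0-9]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


def embed(tokens):
    """Stand-in embedding: sparse term counts (VULDAT uses a sentence transformer)."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * \
        math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def rank_cves(attack_text, cve_texts, threshold=0.3):
    """Return (cve_id, score) pairs at or above the hypothetical threshold, best first."""
    attack_vec = embed(preprocess(attack_text))
    scored = [(cve_id, cosine(attack_vec, embed(preprocess(text))))
              for cve_id, text in cve_texts.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def evaluate(predicted, actual):
    """Precision, recall, F1, and Jaccard similarity between two sets of CVE ids."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    union = predicted | actual
    jaccard = tp / len(union) if union else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "jaccard": jaccard}


if __name__ == "__main__":
    # Made-up texts and placeholder ids, for illustration only.
    attack = ("Adversaries may exploit SQL injection in a web application "
              "login form (Citation: Example) https://example.org/x")
    cves = {
        "CVE-EXAMPLE-1": "SQL injection in the login form of a web application "
                         "allows remote attackers to execute SQL commands.",
        "CVE-EXAMPLE-2": "Buffer overflow in an image parser allows remote "
                         "code execution.",
    }
    ranked = rank_cves(attack, cves)
    print(ranked)
    print(evaluate([cve_id for cve_id, _ in ranked],
                   ["CVE-EXAMPLE-1", "CVE-EXAMPLE-3"]))
```

To move closer to the setup the summary describes, `embed` and `cosine` would be replaced by dense vectors from a pre-trained sentence transformer and a vector cosine similarity. The ranking and evaluation logic would stay the same.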