Abstract: Vulnerabilities publicly disclosed in the National Vulnerability Database (NVD) are assigned CVE (Common Vulnerabilities and Exposures) IDs and associated with specific software versions. Many organizations, including IT companies and government agencies, rely heavily on the vulnerabilities disclosed in NVD to mitigate their security risks. Once NVD reports a piece of software as vulnerable, these organizations check whether they run the vulnerable versions and assess the impact on themselves. However, the version information about vulnerable software in NVD is not always reliable. Nguyen et al. find that the version information of many CVE vulnerabilities is spurious and propose an approach based on the original SZZ algorithm (i.e., an approach to identify bug-introducing commits) to assess the software versions affected by CVE vulnerabilities. However, SZZ algorithms are designed for common bugs, and vulnerabilities differ from common bugs: many bugs are introduced by a recent bug-fixing commit, whereas vulnerabilities are usually introduced in their initial versions. Thus, the current SZZ algorithms often fail to identify the inducing commits for vulnerabilities. Therefore, in this study, we propose an approach based on an improved SZZ algorithm to refine the software versions affected by CVE vulnerabilities. Our proposed SZZ algorithm leverages line mapping algorithms to identify the earliest commits that modified the vulnerable lines and considers these commits to be the vulnerability-inducing commits, as opposed to the previous SZZ algorithms, which assume that the commits that last modified the buggy lines are the inducing commits. To evaluate our proposed approach, we manually annotate the true inducing commits and verify the vulnerable versions for 172 CVE vulnerabilities with fixing commits from two publicly available datasets with five C/C++ and 41 Java projects, respectively.
We find that the version information of 99 of the 172 vulnerabilities is spurious. The experimental results show that our proposed approach identifies the true inducing commits and the correct vulnerable versions for more vulnerabilities than the previous SZZ algorithms. Our approach outperforms the previous SZZ algorithms in terms of F1-score for identifying vulnerability-inducing commits on both C/C++ and Java projects (0.736 and 0.630, respectively). For refining vulnerable versions, our approach also achieves the best performance on the two datasets in terms of F1-score (0.928 and 0.952).
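
The difference between the two strategies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Commit` record, the function names, and the toy history are hypothetical, and the line mapping that a real tool would compute from the repository is supplied by hand. The sketch assumes a single file, a history listed oldest first that ends just before the fixing commit, and releases identified by the index of the last commit they contain.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Commit:
    sha: str
    # Lines this commit modified: line number after the commit -> line number
    # before it (None when the line was newly added by this commit).
    # In a real tool this mapping would come from a line mapping algorithm.
    modified: Dict[int, Optional[int]] = field(default_factory=dict)


def last_modifier(history: List[Commit], line: int) -> Optional[str]:
    """Previous SZZ assumption: blame the commit that last modified the line."""
    for commit in reversed(history):
        if line in commit.modified:
            return commit.sha
    return None


def earliest_modifier(history: List[Commit], line: int) -> Optional[str]:
    """Strategy from the abstract: follow the line back to its earliest modifier."""
    inducing = None
    current: Optional[int] = line
    for commit in reversed(history):
        if current in commit.modified:
            inducing = commit.sha
            current = commit.modified[current]  # map to the older line number
            if current is None:  # the line was introduced here; stop tracing
                break
    return inducing


def affected_versions(history: List[Commit], releases: Dict[str, int],
                      inducing_sha: str) -> List[str]:
    """Releases cut at or after the inducing commit (and before the fix)."""
    start = next(i for i, c in enumerate(history) if c.sha == inducing_sha)
    return [version for version, cut in releases.items() if cut >= start]


# Toy history (oldest first) of the file before the fixing commit.
history = [
    Commit("c1", {10: None}),  # vulnerable line first added as line 10
    Commit("c2", {12: 10}),    # line edited and shifted to line 12
    Commit("c3", {12: 12}),    # cosmetic edit of the same line
]
# Hypothetical releases: version -> index of the last commit it contains.
releases = {"1.0": 0, "1.1": 1, "2.0": 2}

print(last_modifier(history, 12))                                   # c3
print(earliest_modifier(history, 12))                               # c1
print(affected_versions(history, releases, "c3"))                   # ['2.0']
print(affected_versions(history, releases, "c1"))                   # ['1.0', '1.1', '2.0']
```

In this toy history, blaming the last modifier (`c3`) marks only version 2.0 as affected, while tracing the line back to the commit that introduced it (`c1`) also marks 1.0 and 1.1, which is the kind of under-reported version range the abstract is concerned with.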

Vision: Identifying Affected Library Versions for Open Source Software Vulnerabilities

V-SZZ: Automatic Identification of Version Ranges Affected by CVE Vulnerabilities

Patchmatch: A Tool for Locating Patches of Open Source Project Vulnerabilities

Categorizing and Predicting Invalid Vulnerabilities on Common Vulnerabilities and Exposures

Unveil the Mystery of Critical Software Vulnerabilities

Towards More Practical Automation of Vulnerability Assessment

Precise (un)affected Version Analysis for Web Vulnerabilities

Identifying Affected Libraries and Their Ecosystems for Open Source Software Vulnerabilities

Exploiting Library Vulnerability via Migration Based Automating Test Generation

VulNet: Towards improving vulnerability management in the Maven ecosystem

Vulnerable Open Source Dependencies: Counting Those That Matter

LLM-Enhanced Static Analysis for Precise Identification of Vulnerable OSS Versions

CompVPD: Iteratively Identifying Vulnerability Patches Based on Human Validation Results with a Precise Context

VCIPR: Vulnerable Code is Identifiable When a Patch is Released (Hacker's Perspective)

Combining Software Metrics and Text Features for Vulnerable File Prediction

Mvp: Detecting Vulnerabilities Using Patch-Enhanced Vulnerability Signatures

Does the Vulnerability Threaten Our Projects? Automated Vulnerable API Detection for Third-Party Libraries

Automated Mapping of Vulnerability Advisories onto their Fix Commits in Open Source Repositories

SQVDT: A Scalable Quantitative Vulnerability Detection Technique for Source Code Security Assessment

Discovery of Timeline and Crowd Reaction of Software Vulnerability Disclosures

Enhancing Security in Third-Party Library Reuse -- Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis