Enhancing Security in Third-Party Library Reuse -- Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis

Shangzhi Xu, Jialiang Dong, Weiting Cai, Juanru Li, Arash Shaghaghi, Nan Sun, Siqi Ma
2024-11-29
Abstract: Nowadays, software development progresses rapidly to incorporate new features. To facilitate such growth and provide convenience for developers when creating and updating software, reusing open-source software (i.e., third-party library reuses) has become one of the most effective and efficient methods. Unfortunately, the practice of reusing third-party libraries (TPLs) can also introduce vulnerabilities (known as 1-day vulnerabilities) because of the low maintenance of TPLs, resulting in many vulnerable versions remaining in use. If the software incorporating these TPLs fails to detect the introduced vulnerabilities and leads to delayed updates, it will exacerbate the security risks. However, the complicated code dependencies and flexibility of TPL reuses make the detection of 1-day vulnerability a challenging task. To support developers in securely reusing TPLs during software development, we design and implement VULTURE, an effective and efficient detection tool, aiming at identifying 1-day vulnerabilities that arise from the reuse of vulnerable TPLs. It first executes a database creation method, TPLFILTER, which leverages the Large Language Model (LLM) to automatically build a unique database for the targeted platform. Instead of relying on code-level similarity comparison, VULTURE employs hashing-based comparison to explore the dependencies among the collected TPLs and identify the similarities between the TPLs and the target projects. Recognizing that developers have the flexibility to reuse TPLs exactly or in a custom manner, VULTURE separately conducts version-based comparison and chunk-based analysis to capture fine-grained semantic features at the function levels. We applied VULTURE to 10 real-world projects to assess its effectiveness and efficiency in detecting 1-day vulnerabilities. VULTURE successfully identified 175 vulnerabilities from 178 reused TPLs.
Software Engineering
What problem does this paper attempt to address?
The paper addresses **the detection of 1-day vulnerabilities introduced by reusing third-party libraries (TPLs)**. Developers reuse open-source software such as TPLs to speed up development and simplify code maintenance, but the practice also carries security risks. When a TPL is poorly maintained, many vulnerable versions remain in use, and software that fails to detect and fix the inherited vulnerabilities in time stays exposed for longer.

### Main problems:

1. **Low maintenance of third-party libraries**: Some TPLs are no longer actively maintained, which leaves functionality incomplete or security vulnerabilities unpatched.
2. **Complex code dependencies and flexible reuse**: A TPL can be reused exactly or in a customized form, which makes 1-day vulnerabilities hard to detect.
3. **Limitations of existing tools**: Existing tools (such as V1SCAN and MVP) handle only simple custom reuse and do not solve the 1-day vulnerability detection problem comprehensively.

### Solutions proposed in the paper:

The authors propose **VULTURE**, a tool for detecting 1-day vulnerabilities introduced by the reuse of third-party libraries. Its main features are:

1. **Constructing a specialized TPL database**: The TPLFILTER method uses a large language model (LLM) to automatically build a TPL database for a specific platform. The database contains general TPL information together with known vulnerabilities and their patch information.
2. **Hash-based comparison**: Instead of relying on code-level similarity comparison, VULTURE uses hash-based comparison to explore the dependencies among the collected TPLs and to identify the similarities between the TPLs and the target projects.
3. **Version-based and chunk-based analysis**: Because developers may reuse a TPL exactly or in a custom manner, VULTURE conducts version-based comparison and chunk-based analysis separately to capture fine-grained semantic features at the function level.

### Experimental results:

The authors applied VULTURE to 10 real-world projects and identified 175 vulnerabilities from 178 reused TPLs, which they present as evidence of its effectiveness and efficiency in detecting 1-day vulnerabilities.

### Formula representation:

The paper uses few formulas. Two key representations are:

- Each function is represented as \( f_c = \langle H, Birth \rangle \), where \( H \) is the hash value of the function and \( Birth \) is the creation time (birth time) of the function.
- Each TPL version is represented as \( FC = \{ f_c(i) \mid 1 \leq i \leq n \} \), where \( n \) is the number of functions included in this TPL version.

Together, these techniques let VULTURE detect 1-day vulnerabilities introduced by third-party library reuse more accurately, helping developers improve software security.
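
To make the \( f_c = \langle H, Birth \rangle \) and \( FC \) representations concrete, here is a minimal Python sketch of how a function-level hash index might be built and compared against a target project. It is an illustration under stated assumptions, not VULTURE's implementation: the normalization step (stripping C-style comments and whitespace), the choice of SHA-256, the use of a version label as the birth value, and the names `FunctionRecord`, `build_fc`, and `reuse_ratio` are all hypothetical.

```python
from __future__ import annotations

import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionRecord:
    """One f_c = <H, Birth> entry (hypothetical layout)."""
    H: str      # hash of the normalized function body
    birth: str  # assumed stand-in for "birth time": version where it first appeared


def normalize(body: str) -> str:
    """Assumed normalization: drop C-style comments and all whitespace.

    Simplification: a '//' inside a string literal would also be stripped.
    """
    body = re.sub(r"/\*.*?\*/", "", body, flags=re.DOTALL)
    body = re.sub(r"//[^\n]*", "", body)
    return re.sub(r"\s+", "", body)


def function_hash(body: str) -> str:
    """Hash a normalized function body (SHA-256 is an assumption)."""
    return hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()


def build_fc(functions: list[str], birth: str) -> set[FunctionRecord]:
    """Build FC = { f_c(i) | 1 <= i <= n } for one TPL version."""
    return {FunctionRecord(function_hash(b), birth) for b in functions}


def reuse_ratio(fc: set[FunctionRecord], target_functions: list[str]) -> float:
    """Fraction of the TPL version's function hashes found in the target."""
    tpl_hashes = {rec.H for rec in fc}
    if not tpl_hashes:
        return 0.0
    target_hashes = {function_hash(b) for b in target_functions}
    return len(tpl_hashes & target_hashes) / len(tpl_hashes)


# Toy TPL version with two functions.
tpl_v1 = [
    "int add(int a, int b) { return a + b; }",
    "int sub(int a, int b) { return a - b; }",
]

# Toy target project: one function copied with cosmetic changes, one unrelated.
target = [
    "int add(int a,int b){\n  // copied from the library\n  return a+b;\n}",
    "int main(void) { return 0; }",
]

print(reuse_ratio(build_fc(tpl_v1, "1.0"), target))  # 0.5
```

Exact hash matching of this kind only captures unmodified reuse, apart from formatting and comments. Customized reuse, where the function body itself changes, is the case the paper's chunk-based analysis is described as targeting, and this sketch does not attempt it.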