PDGraph: A Large-Scale Empirical Study on Project Dependency of Security Vulnerabilities
Qiang Li,Jinke Song,Dawei Tan,Haining Wang,Jiqiang Liu
DOI: https://doi.org/10.1109/dsn48987.2021.00031
2021-01-01
Abstract:The reuse of libraries in software development has become prevalent for improving development efficiency and software quality. However, security vulnerabilities of reused libraries propagated through software project dependency pose a severe security threat, but they have not yet been well studied. In this paper, we present the first large-scale empirical study of project dependencies with respect to security vulnerabilities. We developed PDGraph, an innovative approach for analyzing publicly known security vulnerabilities among numerous project dependencies, which provides a new perspective for assessing security risks in the wild. As a large-scale software collection in dependency, we find 337,415 projects and 1,385,338 dependency relations. In particular, PDGraph generates a project dependency graph, where each node is a project, and each edge indicates a dependency relationship. We conducted experiments to validate the efficacy of PDGraph and characterized its features for security analysis. We revealed that 1,014 projects have publicly disclosed vulnerabilities, and more than 67,806 projects are directly dependent on them. Among these, 42,441 projects still manifest 67,581 insecure dependency relationships, indicating that they are built on vulnerable versions of reused libraries even though their vulnerabilities are publicly known. During our eight-month observation period, only 1,266 insecure edges were fixed, and corresponding vulnerable libraries were updated to secure versions. Furthermore, we uncovered four underlying dependency risks that can significantly reduce the difficulty of compromising systems. We conducted a quantitative analysis of dependency risks on the PDGraph.