Insight: Exploring Cross-Ecosystem Vulnerability Impacts.

Meiqiu Xu,Ying Wang,Shing-Chi Cheung,Hai Yu,Zhiliang Zhu
DOI: https://doi.org/10.1145/3551349.3556921
2022-01-01
Abstract:Vulnerabilities, referred to as CLV issues, are induced by crosslanguage invocations of vulnerable libraries. Such issues greatly increase the attack surface of Python/Java projects due to their pervasive use of C libraries. Existing Python/Java build tools in PyPI and Maven ecosystems fail to report the dependency on vulnerable libraries written in other languages such as C. CLV issues are easily missed by developers. In this paper, we conduct the first empirical study on the status quo of CLV issues in PyPI and Maven ecosystems. It is found that 82,951 projects in these ecosystems are directly or indirectly dependent on libraries compiled from the C project versions that are identified to be vulnerable in CVE reports. Our study arouses the awareness of CLV issues in popular ecosystems and presents related analysis results. The study also leads to the development of the first automated mechanism, Insight, which provides a turn-key solution to the identification of CLV issues in PyPI and Maven projects based on published CVE reports of vulnerable C projects. Insight automatically identifies if a PyPI or Maven project is using a C library compiled from vulnerable C project versions in published CVE reports. It also deduces the vulnerable APIs involved by analyzing the usage of various foreign function interfaces such as CFFI and JNI in the concerned PyPI or Maven project. Insight achieves a high detection rate of 88.4% on a popular CLV issue benchmark. Contributing to the open-source community, we report 226 CLV issues detected in the actively maintained PyPI and Maven projects that are directly dependent on vulnerable C library versions. Our reports are well received and appreciated by developers with queries on the availability of Insight. 127 reported issues (56.2%) were quickly confirmed by developers and 74.8% of them were fixed/under fixing by popular projects, such as Mongodb [40] and Eclipse/Sumo [19].
What problem does this paper attempt to address?