OSSDetector: Towards a More Accurate Approach for C/C++ Third-Party Library Detection
Jia Zeng, Yaling Zhu, Dan Han, Fangchen Weng, Ruidong Li, Yuqing Zhang
DOI: https://doi.org/10.21203/rs.3.rs-4366210/v1
2024-01-01
Abstract: Third-party libraries (TPLs) give developers rich functionality and ready-made solutions, speeding up software development. As TPL adoption has spread, however, the associated security risks have become increasingly apparent. Software Composition Analysis (SCA), which detects and manages the TPLs in a software project, is crucial for addressing these risks. Existing SCA tools for C/C++ projects face several challenges: the consequences of modified and nested TPLs, the lack of a precise version representation, and the lack of a comprehensive TPL database. They do not address these challenges adequately and cannot detect the precise TPL version. To alleviate these issues, we propose OSSDetector, a new SCA tool that identifies TPLs and their versions in C/C++ projects. OSSDetector uses sliding-window and fuzzy-hashing techniques to generate finer-grained signatures, which lets it identify modified TPLs and mitigate their consequences. It applies a "Nested TPL Function Filtering" algorithm, based on multiple features, to identify and filter out nested TPL function signatures in each TPL, mitigating the consequences of nested TPLs. It then applies a "TPL Recognition" algorithm, based on import ratios and function paths, to determine which TPLs the target C/C++ software uses. To compensate for the lack of a precise version representation, it determines TPL versions from function weights and version release times. To compensate for the lack of a comprehensive TPL database, we construct a large database of 29,416 C/C++ TPLs covering 767,405 versions. Experimental results show that, compared with state-of-the-art tools, OSSDetector achieves better precision (85.52%), recall (79.82%), and F1 score (82.57%) at the library level, with improvements of 3.18%, 3.59%, and 3.40%, respectively. At the library version level, OSSDetector also achieves higher precision (84.27%), outperforming state-of-the-art tools by 4.64%.
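As a quick consistency check on the library-level figures, the reported F1 score can be recomputed from the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall, which the abstract does not state explicitly; the function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Library-level precision and recall as reported in the abstract (percent).
library_precision = 85.52
library_recall = 79.82

library_f1 = f1_score(library_precision, library_recall)
print(round(library_f1, 2))  # prints 82.57
```

Under that assumption, the recomputed value matches the 82.57% F1 score reported in the abstract.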