Vulnerability discovery based on source code patch commit mining: a systematic literature review
Fei Zuo, Junghwan Rhee
DOI: https://doi.org/10.1007/s10207-023-00795-8
2024-01-07
International Journal of Information Security
Abstract: In recent years, there has been a remarkable surge in the adoption of open-source software (OSS). However, with the growing usage of OSS components in both free and proprietary software, vulnerabilities present within them can spread to a vast array of underlying applications. Even worse, a myriad of vulnerabilities are fixed secretly via patch commits, which leaves other software that reuses the vulnerable code snippets in the dark. Thus, source code patch commit mining toward vulnerability discovery is receiving immense attention, and a variety of approaches have been proposed. Despite that, there is no comprehensive survey summarizing and discussing the current progress within this field. To fill this gap, we survey, evaluate, and systematize the relevant literature and provide the community with our insights on both successes and remaining issues in this space. Special attention is paid to work toward vulnerability discovery. In this paper, we also provide an introductory panorama with our replicable hands-on experience, which can help readers quickly understand and step into the pertinent field. Our empirical study reveals noteworthy challenges that need to be highlighted and addressed in this field. We also discuss potential directions for future work. To the best of our knowledge, we provide the first literature review to study source code patch commit mining in the vulnerability discovery context. The systematic framework, hands-on practices, and list of potential challenges provide new knowledge for mining source code patch commits toward a more robust software ecosystem. The research gaps found in this literature review show the need for future research on issues such as data quality, high false alarm rates, and the significance of textual information.
computer science, information systems, theory & methods, software engineering