SoK: On Closing the Applicability Gap in Automated Vulnerability Detection

Ezzeldin Shereen, Dan Ristea, Sanyam Vyas, Shae McFadden, Madeleine Dwyer, Chris Hicks, Vasilios Mavroudis
2024-12-15
Abstract: The frequent discovery of security vulnerabilities in both open-source and proprietary software underscores the urgent need for earlier detection during the development lifecycle. Initiatives such as DARPA's Artificial Intelligence Cyber Challenge (AIxCC) aim to accelerate Automated Vulnerability Detection (AVD), seeking to address this challenge by autonomously analyzing source code to identify vulnerabilities. This paper addresses two primary research questions: (RQ1) How is current AVD research distributed across its core components? (RQ2) What key areas should future research target to bridge the gap in the practical applicability of AVD throughout software development? To answer these questions, we conduct a systematization over 79 AVD articles and 17 empirical studies, analyzing them across five core components: task formulation and granularity, input programming languages and representations, detection approaches and key solutions, evaluation metrics and datasets, and reported performance. Our systematization reveals that the narrow focus of AVD research, mainly on specific tasks and programming languages, limits its practical impact and overlooks broader areas crucial for effective, real-world vulnerability detection. We identify significant challenges, including the need for diversified problem formulations, varied detection granularities, broader language support, better dataset quality, enhanced reproducibility, and increased practical impact. Based on these findings, we identify research directions that will enhance the effectiveness and applicability of AVD solutions in software security.
Software Engineering, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the applicability gap of Automated Vulnerability Detection (AVD) in practice. Through a systematic literature review, the authors aim to answer two main research questions:

1. **How is current AVD research distributed across its core components?**
2. **What key areas should future research target to bridge the gap in the practical applicability of AVD throughout software development?**

### Research Background and Motivation

Security vulnerabilities are frequently discovered in both open-source and proprietary software, which makes earlier detection during development urgent. Initiatives such as DARPA's Artificial Intelligence Cyber Challenge (AIxCC) aim to accelerate AVD, which analyzes source code automatically to identify potential vulnerabilities. Despite rapid progress in AVD research, its practical application still faces many challenges.

### Main Problems

The paper identifies the following problems in current AVD research:

- **Too Narrow Focus**: Most research targets specific tasks and programming languages, which limits the scope of practical application.
- **Lack of Diversity**: Most research concentrates on binary classification and on C/C++, overlooking other high-risk languages (such as PHP) and finer-grained detection.
- **Insufficient Dataset Quality and Reproducibility**: The quality of existing datasets varies widely, and many studies do not follow open-science practices, which undermines the reliability and reproducibility of results.
- **Limited Practical Impact**: Although the number of studies is large, the proportion of studies that actually discover new vulnerabilities is decreasing year by year, indicating that the practical impact of AVD needs to improve.
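To make the contrast between binary classification and finer-grained formulations concrete, here is a minimal Python sketch. All names (`Granularity`, `Finding`, `to_binary_label`) and the example locations are hypothetical and not taken from the paper; the CWE identifier serves only as an illustrative label. The sketch shows that a multi-class, line-level finding can always be collapsed into a binary label, while the reverse is not possible.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Granularity(Enum):
    # Detection granularities named in the summary.
    FUNCTION = "function"
    LINE = "line"
    COMMIT = "commit"


@dataclass(frozen=True)
class Finding:
    """One detector output (hypothetical structure, for illustration only)."""
    granularity: Granularity
    location: str            # e.g. a function name, "file:line", or a commit id
    cwe_id: Optional[str]    # None means "predicted not vulnerable"


def to_binary_label(finding: Finding) -> bool:
    """Collapse a multi-class CWE prediction into a vulnerable/safe label."""
    return finding.cwe_id is not None


# A multi-class, line-level finding: says what kind of flaw and where.
fine = Finding(Granularity.LINE, "parser.c:42", "CWE-787")
# A function-level finding with no vulnerability predicted.
coarse = Finding(Granularity.FUNCTION, "parse_header", None)

print(to_binary_label(fine))    # True
print(to_binary_label(coarse))  # False
```

The design point is that a binary, function-level output discards both the vulnerability type and the precise location, which is the information a developer would need to act on a report.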
### Research Methods

To address these problems, the authors systematically analyzed 79 AVD articles and 17 empirical studies across five core components of AVD:

1. **Task Definition**: Problem formulation and detection granularity.
2. **Input Software**: Programming languages and input representations.
3. **Detection Method**: Detection approaches and key solutions.
4. **Evaluation**: Evaluation metrics and datasets.
5. **Performance**: Reported performance results.

### Main Findings

The systematic analysis of the literature leads to the following conclusions:

- **Binary Classification Dominates**: 89.9% of the research focuses on binary classification, while multi-class CWE classification accounts for only 10.1%, which limits insight into vulnerability types.
- **Function-Level Detection Is the Most Common**: 68.4% of the research focuses on function-level detection, while line-level and commit-level detection are less common.
- **Insufficient Programming Language Support**: C/C++ is the main focus of research, while work on other high-risk languages (such as PHP) is scarce, so practical application scenarios are not fully covered.
- **Dataset Quality and Reproducibility Need Improvement**: The quality and openness of existing datasets are insufficient, which affects the reliability and reproducibility of research.

### Future Research Directions

Based on these findings, the authors suggest that future AVD research focus on the following aspects:

- **Diversifying Problem Formulation**: Explore more types of classification tasks, especially multi-class CWE classification, to provide richer vulnerability information.
- **Finer-Grained Detection Methods**: Develop line-level and commit-level detection methods to meet the needs of different stages of software development.
- **Expanding Programming Language Support**: Increase support for other high-risk languages (such as PHP) so that AVD is broadly applicable.
- **Improving Dataset Quality and Reproducibility**: Establish high-quality, diverse datasets and follow open-science practices to improve the reliability of research.

In summary, through a systematic literature review, the paper identifies the problems in current AVD research and points out directions for future work, aiming to close the gap in AVD's practical applicability and promote its adoption in software security.
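
As a rough sanity check on the percentages quoted under Main Findings, the sketch below assumes they are all taken over the 79 AVD articles. This is an assumption: the summary states neither the denominators nor the raw counts, so the counts computed here are inferred, not reported.

```python
TOTAL_ARTICLES = 79  # from the summary; used as the denominator by assumption

# Percentages as quoted in the summary.
reported_percent = {
    "binary classification": 89.9,
    "multi-class CWE classification": 10.1,
    "function-level detection": 68.4,
}


def implied_count(percent: float, total: int) -> int:
    """Nearest whole number of articles corresponding to a percentage."""
    return round(percent / 100 * total)


implied = {
    name: implied_count(p, TOTAL_ARTICLES)
    for name, p in reported_percent.items()
}

for name, count in implied.items():
    # Convert the inferred count back to a percentage to compare with the quote.
    recomputed = 100 * count / TOTAL_ARTICLES
    print(f"{name}: ~{count} of {TOTAL_ARTICLES} articles ({recomputed:.1f}%)")
```

Under that assumption, the implied counts are 71, 8, and 54 articles, and each rounds back to the quoted percentage, which is consistent with a denominator of 79.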