AI Cyber Risk Benchmark: Automated Exploitation Capabilities

Dan Ristea, Vasilios Mavroudis, Chris Hicks
2024-12-09
Abstract: We introduce a new benchmark for assessing AI models' capabilities and risks in automated software exploitation, focusing on their ability to detect and exploit vulnerabilities in real-world software systems. Using DARPA's AI Cyber Challenge (AIxCC) framework and the Nginx challenge project, a deliberately modified version of the widely used Nginx web server, we evaluate several leading language models, including OpenAI's o1-preview and o1-mini, Anthropic's Claude-3.5-sonnet-20241022 and Claude-3.5-sonnet-20240620, Google DeepMind's Gemini-1.5-pro, and OpenAI's earlier GPT-4o model. Our findings reveal that these models vary significantly in their success rates and efficiency, with o1-preview achieving the highest success rate of 64.71 percent and o1-mini and Claude-3.5-sonnet-20241022 providing cost-effective but less successful alternatives. This benchmark establishes a foundation for systematically evaluating the AI cyber risk posed by automated exploitation tools.
Cryptography and Security, Artificial Intelligence
What problem does this paper attempt to address?
This paper evaluates the capabilities and risks of large language models (LLMs) in automated software vulnerability detection and exploitation. Specifically, it introduces a new benchmarking framework to systematically evaluate these models' ability to discover and exploit vulnerabilities in real-world software systems.

### Main Problems

1. **Evaluating the Automated Vulnerability Exploitation Capability of AI Models**: The paper focuses on how to evaluate an AI model's ability to detect and exploit software vulnerabilities automatically, including whether the model can identify and exploit vulnerabilities in actual software systems.
2. **Establishing Evaluation Criteria**: The authors aim to provide a fair and challenging evaluation environment by using DARPA's AI Cyber Challenge (AIxCC) framework and the modified Nginx web server project, which contains 17 carefully designed vulnerabilities.
3. **Exploring the Double-Edged Sword Effect**: The application of LLMs to cybersecurity has a dual nature. On the one hand, it can accelerate vulnerability detection and repair and thereby enhance security; on the other hand, it may be used for malicious purposes, such as automated software attacks. The paper therefore also discusses the potential risks of these models and possible response mechanisms.

### Methodology

- **Selecting an Appropriate Test Platform**: The Nginx AIxCC challenge project was selected as the test platform because it provides a clear scope and a complex real-world application scenario, and ensures that the test data is not included in the models' training sets.
- **Iterative Improvement Mechanism**: Through a reflexion loop, the model can adjust and refine its next attempt based on its previous failed attempts.
- **Multi-Dimensional Evaluation**: Model performance was evaluated from multiple perspectives, including success rate, cost-efficiency, and adaptability.

### Results

- **Significant Performance Differences**: Success rates differ markedly between models. o1-preview performs best, with a success rate of 64.71%; other models such as Claude-3.5-sonnet-20240620 and Gemini-1.5-pro also show potential, but with lower success rates.
- **Cost-Benefit Analysis**: Although some models have higher costs, they are more cost-effective overall due to their higher success rates.

### Significance

This research not only reveals the current potential of LLMs in automated vulnerability exploitation but also emphasizes the need for caution when applying these technologies, in order to avoid possible security threats.
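
The reflexion loop and the success-rate metric described in the summary can be illustrated with a minimal sketch. This is not the authors' harness: `AttemptResult`, `reflexion_loop`, `success_rate`, and the stub model and harness are all hypothetical names introduced here for illustration, and the attempt limit is an assumed parameter. The sketch only shows the general shape of the idea: propose an input, run it, feed the failure output back into the next proposal, and count how many vulnerabilities were eventually triggered.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AttemptResult:
    """Outcome of running one candidate input (hypothetical structure)."""
    triggered: bool  # True if the target vulnerability was triggered
    feedback: str    # harness output passed back to the model on failure


def reflexion_loop(
    propose: Callable[[List[str]], str],
    run_harness: Callable[[str], AttemptResult],
    max_attempts: int,
) -> Optional[int]:
    """Return the 1-based attempt number that succeeded, or None.

    `propose` stands in for a model call: it receives the feedback from
    all previous failed attempts and returns a new candidate input.
    """
    history: List[str] = []
    for attempt in range(1, max_attempts + 1):
        candidate = propose(history)
        result = run_harness(candidate)
        if result.triggered:
            return attempt
        # Keep the failure output so the next proposal can reflect on it.
        history.append(result.feedback)
    return None


def success_rate(outcomes: List[Optional[int]]) -> float:
    """Percentage of vulnerabilities with at least one successful attempt."""
    solved = sum(1 for outcome in outcomes if outcome is not None)
    return 100.0 * solved / len(outcomes)


# Stub model and harness (assumptions, for demonstration only).
def stub_propose(history: List[str]) -> str:
    return f"input-{len(history)}"


def stub_harness(candidate: str) -> AttemptResult:
    hit = candidate == "input-2"
    return AttemptResult(triggered=hit, feedback=f"{candidate} did not crash")


if __name__ == "__main__":
    print(reflexion_loop(stub_propose, stub_harness, max_attempts=5))
    # 11 of 17 solved is one count that rounds to the reported figure.
    print(round(success_rate([1] * 11 + [None] * 6), 2))
```

With 17 vulnerabilities in the benchmark, a success rate of 64.71% is arithmetically consistent with 11 of 17 being exploited (11 / 17 ≈ 0.6471), although the summary itself does not state the count.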