AI Code Generators for Security: Friend or Foe?

Roberto Natella, Pietro Liguori, Cristina Improta, Bojan Cukic, Domenico Cotroneo
DOI: https://doi.org/10.1109/MSEC.2024.3355713
2024-02-02
Abstract: Recent advances of artificial intelligence (AI) code generators are opening new opportunities in software security research, including misuse by malicious actors. We review use cases for AI code generators for security and introduce an evaluation benchmark.
Cryptography and Security, Artificial Intelligence, Software Engineering
What problem does this paper attempt to address?
This paper explores the applications and potential impact of AI code generators in the security field. Specifically, it addresses the following core issues:

1. **Security Applications and Misuse of AI Code Generators**:
   - The paper discusses how AI code generators can support security research and how they may also be misused by malicious actors. The authors argue that security professionals need to use AI code generators themselves to better prevent and mitigate intrusions.
2. **Establishment of Evaluation Benchmarks**:
   - To systematically evaluate how well AI code generators produce synthetic attacks, the paper introduces a new dataset and an experimental evaluation method. The dataset contains a collection of security-related Python programs, each accompanied by natural-language descriptions.
3. **Potential Benign Applications of Generating Synthetic Attacks**:
   - The paper explores several benign applications of synthetic-attack generation in penetration testing, such as attack-surface analysis, open-source intelligence (OSINT), vulnerability exploitation, and post-exploitation activities.
4. **Experimental Evaluation**:
   - The authors evaluated three popular AI code generators (GitHub Copilot, Amazon CodeWhisperer, and CodeBERT), comparing their performance at three granularities: single lines of code, multi-line code blocks, and entire functions. The results show that the fine-tuned CodeBERT performs best at generating security-related code.
5. **Future Research Directions**:
   - The paper also discusses future research directions for generative AI in the security field, emphasizing the importance of datasets and of fine-tuning for improving model performance on security tasks.
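The benchmark idea in items 2 and 4 (natural-language descriptions paired with reference Python code, scored at line, block, and function granularity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Sample` record, the `evaluate` helper, and the use of `difflib` string similarity as the metric are hypothetical and are not taken from the paper, whose actual dataset format and metrics are not described in this summary.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from statistics import mean


@dataclass
class Sample:
    """Hypothetical benchmark record (field names are assumptions)."""
    description: str  # natural-language intent given to the generator
    reference: str    # ground-truth Python code
    granularity: str  # "line", "block", or "function"


def normalize(code: str) -> str:
    """Drop trailing whitespace and blank lines so layout noise is ignored."""
    lines = [line.rstrip() for line in code.strip().splitlines()]
    return "\n".join(line for line in lines if line)


def similarity(generated: str, reference: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in metric, not the paper's."""
    return SequenceMatcher(None, normalize(generated), normalize(reference)).ratio()


def evaluate(samples, generate):
    """Average similarity per granularity for a generator callable.

    `generate` maps a description string to generated code; in practice it
    would wrap whichever code generator is being assessed.
    """
    by_level = {}
    for sample in samples:
        score = similarity(generate(sample.description), sample.reference)
        by_level.setdefault(sample.granularity, []).append(score)
    return {level: mean(scores) for level, scores in by_level.items()}
```

Grouping scores by granularity mirrors the line/block/function comparison described above; a real evaluation would likely replace the string-similarity stand-in with whatever metrics the benchmark defines.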
In conclusion, through empirical research and data analysis, this paper aims to give security professionals guidance on how to use AI code generators effectively, and it explores the potential applications and challenges of this technology in the security field.