Abstract: Large language models (LLMs) have demonstrated impressive results on natural language tasks, and security researchers are beginning to employ them in both offensive and defensive systems. In cyber-security, multiple research efforts have used LLMs for the pre-breach stage of attacks, such as phishing and malware generation. However, so far there has been no comprehensive study of whether LLM-based systems can be leveraged to simulate the post-breach stage of attacks that are typically human-operated, or "hands-on-keyboard" attacks, under various attack techniques and environments.

What problem does this paper attempt to address?

This paper explores whether large language models (LLMs) can automate the execution of cyber-attacks, in particular the "hands-on-keyboard" attack phases that usually require a human operator. Although LLMs have demonstrated strong capabilities in natural language processing tasks, their application to cybersecurity, and especially to simulating post-breach attacks, has not been fully studied. The authors point out that as LLM technology progresses, these models may be used to execute the entire attack process automatically, from the early stages to the late ones. This could turn attacks against organizations from rare, expert-driven events into frequent automated operations that require no professional knowledge and run at machine speed and scale. Such a shift has the potential to fundamentally change the global computer security landscape and to cause significant economic impact.

To study this risk, the paper proposes an LLM-guided system, AUTOATTACKER, which aims to automate hands-on-keyboard attacks across different attack techniques and environment settings. The system interacts with the LLM iteratively through four modules (summarizer, planner, navigator, and experience manager) to generate precise attack commands. These modules are designed to overcome the challenges that arise when an LLM is used directly, such as difficulty tracking context, the high complexity of the command space, and insufficient recognition of subtle differences between environments.

Specifically, the main contributions of the paper are:

1. Presenting the first comprehensive evaluation of the potential of applying LLMs to human-like hands-on-keyboard attacks.
2. Designing a new system, AUTOATTACKER, for using LLMs to automate attacks, proposing a modular agent design that can obtain precise attack commands from LLMs, and introducing new reasoning and planning procedures.
3. Developing a new benchmark for evaluating LLM-based attack automation, with attack tasks ranging from basic to advanced.
4. Evaluating the effectiveness of AUTOATTACKER; the results show that, when backed by GPT-4, the system successfully completes all attack tasks.

Through this research, the authors hope to better understand these risks so that defenders can prepare in advance for the challenges that more advanced LLMs will inevitably bring.
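
The four-module loop described above can be illustrated with a minimal, side-effect-free sketch. This is not the authors' implementation: the summary names the modules but does not specify their interfaces, so the responsibilities assigned here (the summarizer condenses history, the planner proposes the next action, the experience manager recalls actions that worked before, the navigator picks among the candidates) are assumptions inferred from the challenges listed. The names `run_episode`, `stub_llm`, and `stub_env` are hypothetical, and both the model and the environment are toy stand-ins that only exchange placeholder strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-ins: an LLM maps a prompt to text, and an environment
# maps an action string to an observation string. Nothing is executed.
LLM = Callable[[str], str]
Environment = Callable[[str], str]


@dataclass
class ExperienceManager:
    """Remembers actions that worked before (assumed behaviour)."""
    records: List[str] = field(default_factory=list)

    def add(self, action: str) -> None:
        if action not in self.records:
            self.records.append(action)

    def recall(self, situation: str) -> List[str]:
        # Naive retrieval: stored actions sharing a word with the situation.
        words = set(situation.lower().split())
        return [a for a in self.records if words & set(a.lower().split())]


def summarize(llm: LLM, previous_summary: str, observation: str) -> str:
    """Summarizer: keep context short instead of replaying the full history."""
    return llm(f"SUMMARIZE\n{previous_summary}\n{observation}")


def plan(llm: LLM, objective: str, situation: str) -> str:
    """Planner: propose one next action for the current situation."""
    return llm(f"PLAN\n{objective}\n{situation}")


def navigate(llm: LLM, candidates: List[str], situation: str) -> str:
    """Navigator: choose one action among planner and recalled candidates."""
    if len(candidates) == 1:
        return candidates[0]
    choice = llm(f"CHOOSE\n{situation}\n" + "\n".join(candidates))
    return choice if choice in candidates else candidates[0]


def run_episode(llm: LLM, environment: Environment, objective: str,
                experience: ExperienceManager, max_steps: int = 5) -> List[str]:
    """Iterate summarize -> plan -> navigate -> act until 'done' or max_steps."""
    summary, observation, trace = "", "start", []
    for _ in range(max_steps):
        summary = summarize(llm, summary, observation)
        candidates = [plan(llm, objective, summary)] + experience.recall(summary)
        action = navigate(llm, candidates, summary)
        observation = environment(action)
        trace.append(action)
        if observation == "done":
            experience.add(action)  # remember the action that finished the task
            break
    return trace


def stub_llm(prompt: str) -> str:
    """Toy model: echoes the last prompt line, except for a two-step plan."""
    kind = prompt.split("\n", 1)[0]
    if kind == "PLAN":
        return "step-two" if "step-one ok" in prompt else "step-one"
    return prompt.splitlines()[-1]


def stub_env(action: str) -> str:
    """Toy environment: the task is finished once 'step-two' is chosen."""
    return "done" if action == "step-two" else f"{action} ok"


if __name__ == "__main__":
    print(run_episode(stub_llm, stub_env, "reach done", ExperienceManager()))
```

Keeping each module as a separate function with a narrow prompt is one plausible way to address the context-tracking and command-space issues the summary mentions: the planner only ever sees a short summary, and the navigator only has to choose among a handful of candidates.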

AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks

Large Language Models in Cybersecurity: State-of-the-Art

Large Language Models for Cyber Security: A Systematic Literature Review

A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models

Recent Advances in Attack and Defense Approaches of Large Language Models

Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks

Exploring Vulnerabilities and Protections in Large Language Models: A Survey

A Survey of Large Language Models for Cyber Threat Detection

The Best Defense is a Good Offense: Countering LLM-Powered Cyberattacks

A Survey of Large Language Models in Cybersecurity

L-AutoDA: Large Language Models for Automatically Evolving Decision-based Adversarial Attacks

L-AutoDA: Leveraging Large Language Models for Automated Decision-based Adversarial Attacks

CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

Jailbreaking and Mitigation of Vulnerabilities in Large Language Models

Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models

ThreatModeling-LLM: Automating Threat Modeling using Large Language Models for Banking System

Emerging Security Challenges of Large Language Models

Large language models in 6G security: challenges and opportunities

Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges

Adversarial Attacks and Defenses in Large Language Models: Old and New Threats