Abstract: Malicious URL classification represents a crucial aspect of cyber security. Although existing work comprises numerous machine learning and deep learning-based URL classification models, most suffer from generalisation and domain-adaptation issues arising from the lack of representative training datasets. Furthermore, these models fail to provide explanations for a given URL classification in natural human language. In this work, we investigate and demonstrate the use of Large Language Models (LLMs) to address this issue. Specifically, we propose an LLM-based one-shot learning framework that uses Chain-of-Thought (CoT) reasoning to predict whether a given URL is benign or phishing. We evaluate our framework using three URL datasets and five state-of-the-art LLMs and show that one-shot LLM prompting indeed provides performances close to supervised models, with GPT 4-Turbo being the best model, followed by Claude 3 Opus. We conduct a quantitative analysis of the LLM explanations and show that most of the explanations provided by LLMs align with the post-hoc explanations of the supervised classifiers, and the explanations have high readability, coherency, and informativeness.

What problem does this paper attempt to address?

The paper targets two key challenges in phishing URL classification: generalization and interpretability.

1. **Generalization**:
   - Existing machine-learning and deep-learning models suffer a significant performance drop when the test URLs come from a different data source than the training URLs. The main cause is unrepresentative training data, which leaves the model unable to adapt to new, unseen data.
   - This weakness stems from inherent bias in the datasets. For example, an organization's dataset tends to be skewed towards the URLs its employees visit most often.
2. **Interpretability (explainability)**:
   - Existing URL classification models are usually black boxes and cannot explain in natural language why a URL was classified as benign or malicious. Without such explanations, users struggle to understand the basis for a classification, and their trust in the model drops.
   - Easy-to-understand explanations are crucial for raising users' security awareness, especially when the false-positive rate is high.

To address these issues, the paper proposes a one-shot learning framework based on large language models (LLMs). The framework uses Chain-of-Thought (CoT) reasoning to predict whether a URL is phishing and provides a natural-language explanation for each classification. In this way, the paper aims to improve both the generalization and the interpretability of URL classification, and thereby better protect users from phishing attacks.

### The specific contributions of the paper include:

- **Proposing an LLM-based framework** that combines CoT reasoning and one-shot learning for phishing URL classification, and demonstrating that LLMs can act as interpretable one-shot classifiers.
- **Evaluating five state-of-the-art LLMs** on three different phishing URL datasets, and comparing the framework's performance with existing supervised URL classifiers.
- **Demonstrating its performance in one-shot and zero-shot settings**, where GPT-4 Turbo achieved an average F1 score of 0.92 in the one-shot setting, only 0.07 points lower than the fully supervised setting.
- **Verifying the interpretability of the classification framework** by comparing the LLMs' self-explanations with the post-hoc explanations obtained in the supervised setting, and by evaluating the correctness and language quality of those self-explanations.
- **Analyzing the consistency of LLM predictions**, as well as performance in zero-shot and few-shot settings. The results show that increasing the number of examples has little impact on prediction accuracy.

Through these contributions, the paper improves the accuracy and generalization of URL classification and makes the results more interpretable, so that users can better understand and trust them.
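The one-shot CoT setup summarized above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the instruction text, the worked example URL and its reasoning, the `Label:` reply convention, and the helper names `build_one_shot_prompt` and `parse_reply` are hypothetical and are not the prompt or code used in the paper. The sketch only builds the prompt and parses a reply; sending the prompt to a model is left to whichever LLM client is in use.

```python
import re
from typing import Optional, Tuple

# Hypothetical instructions; the paper's actual prompt wording is not reproduced here.
INSTRUCTIONS = (
    "You are a security analyst. Decide whether the URL is benign or phishing. "
    "Think step by step, then finish with a line 'Label: benign' or 'Label: phishing'."
)

# Hypothetical worked example shown to the model (the single "shot").
ONE_SHOT_EXAMPLE = {
    "url": "http://secure-login.example-bank.com.verify-account.xyz/signin",
    "reasoning": (
        "The registered domain is verify-account.xyz, not the bank's own domain. "
        "The brand name appears only as a subdomain, and the URL uses plain HTTP."
    ),
    "label": "phishing",
}

# Matches a verdict line such as "Label: phishing" anywhere in the reply.
LABEL_RE = re.compile(r"^Label:\s*(benign|phishing)\s*$", re.IGNORECASE | re.MULTILINE)


def build_one_shot_prompt(url: str) -> str:
    """Build a prompt with one worked example followed by the URL to classify."""
    ex = ONE_SHOT_EXAMPLE
    return (
        f"{INSTRUCTIONS}\n\n"
        f"URL: {ex['url']}\n"
        f"Reasoning: {ex['reasoning']}\n"
        f"Label: {ex['label']}\n\n"
        f"URL: {url}\n"
        f"Reasoning:"
    )


def parse_reply(reply: str) -> Tuple[Optional[str], str]:
    """Split a model reply into (label, explanation).

    The last 'Label:' line is taken as the verdict and the text before it as
    the explanation. The label is None when no verdict line is found.
    """
    matches = list(LABEL_RE.finditer(reply))
    if not matches:
        return None, reply.strip()
    last = matches[-1]
    return last.group(1).lower(), reply[: last.start()].strip()
```

Ending the reply with a fixed verdict line keeps the free-text reasoning available as the user-facing explanation while making the label easy to extract for scoring against ground truth. A zero-shot variant would simply omit the worked example from the prompt.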

LLMs are One-Shot URL Classifiers and Explainers
