Abstract: Large Language Models (LLMs) have gained prominence in various applications, including security. This paper explores the utility of LLMs in scam detection, a critical aspect of cybersecurity. Unlike traditional applications, we propose a novel use case for LLMs to identify scams, such as phishing, advance fee fraud, and romance scams. We present notable security applications of LLMs and discuss the unique challenges posed by scams. Specifically, we outline the key steps involved in building an effective scam detector using LLMs, emphasizing data collection, preprocessing, model selection, training, and integration into target systems. Additionally, we conduct a preliminary evaluation using GPT-3.5 and GPT-4 on a duplicated email, highlighting their proficiency in identifying common signs of phishing or scam emails. The results demonstrate the models' effectiveness in recognizing suspicious elements, but we emphasize the need for a comprehensive assessment across various language tasks. The paper concludes by underlining the importance of ongoing refinement and collaboration with cybersecurity experts to adapt to evolving threats.

What problem does this paper attempt to address?

The paper addresses fraud detection in cybersecurity using large language models (LLMs). Specifically, it explores how LLMs can be used to identify various types of fraud, such as phishing, advance-fee fraud, and romance scams. The paper argues that traditional methods struggle with these complex and constantly evolving fraud techniques, whereas LLMs, with their strong natural language processing capabilities, can identify and help prevent such scams more effectively.

### Main Research Content:

1. **Data Collection**: Collect data containing various types of fraud and legitimate content to build a diverse dataset.
2. **Data Preprocessing**: Clean and format the text data to make it suitable for model training.
3. **Annotation**: Label each piece of text as "fraud" or "legitimate" through annotators or crowdsourcing platforms.
4. **Model Selection**: Choose an appropriate LLM architecture, such as GPT-3 or BERT, and perform task-specific fine-tuning.
5. **Training**: Train the model using supervised learning techniques to classify the text based on the provided labels.
6. **Evaluation**: Rigorously evaluate the model's performance using metrics such as precision, recall, and F1 score.
7. **Hyperparameter Tuning**: Adjust hyperparameters like learning rate and batch size to optimize model performance.
8. **False Positive Analysis**: Analyze the false positives generated by the model to reduce the number of legitimate communications misclassified as fraud.
9. **Threshold Setting**: Determine an appropriate confidence threshold to balance false positives and false negatives.
10. **Integration**: Integrate the trained model into the target system to achieve real-time fraud detection.

### Preliminary Evaluation:

The paper conducted a preliminary evaluation using GPT-3.5 and GPT-4 on a suspected phishing email. Both models identified multiple suspicious signs in the email, such as an unusual sender address, grammatical errors, suspicious links, and a non-personalized greeting. This suggests that LLMs can recognize the common signs of phishing emails, though a single email is too small a sample to quantify accuracy.

### Conclusion:

Although GPT-3.5 and GPT-4 performed well in this evaluation, the paper emphasizes the need for more comprehensive evaluations to determine the relative strengths and weaknesses of these models across different natural language understanding and generation tasks. Continuous collaboration with domain experts and cybersecurity professionals to adapt to emerging threats is also key to improving the effectiveness of LLMs in fraud detection. With further optimization, LLMs have the potential to significantly enhance online security measures, protecting individuals and organizations from fraud.
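
Steps 6 and 9 of the outline above (evaluation and threshold setting) can be illustrated with a small sketch. The paper does not publish code, so everything here is an assumption made for illustration: the function names, the toy labels and scores, and the choice of maximizing F1 as the criterion for picking a threshold. A real deployment might instead weight false positives and false negatives differently.

```python
from typing import List, Sequence, Tuple


def confusion_counts(labels: Sequence[int], scores: Sequence[float],
                     threshold: float) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives).

    A message is predicted as fraud (1) when its score is >= threshold.
    """
    tp = fp = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    return tp, fp, fn


def precision_recall_f1(labels: Sequence[int], scores: Sequence[float],
                        threshold: float) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 at a given threshold."""
    tp, fp, fn = confusion_counts(labels, scores, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_threshold(labels: Sequence[int], scores: Sequence[float],
                   candidates: List[float]) -> Tuple[float, float]:
    """Return the candidate threshold with the highest F1 and that F1.

    Assumption: F1 is the selection criterion; ties keep the earliest candidate.
    """
    best_t, best_f1 = candidates[0], -1.0
    for t in candidates:
        f1 = precision_recall_f1(labels, scores, t)[2]
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Hypothetical data: 1 = fraud, 0 = legitimate; scores are model confidences.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.95, 0.80, 0.40, 0.60, 0.20, 0.10]

for t in (0.3, 0.5, 0.7):
    p, r, f1 = precision_recall_f1(toy_labels, toy_scores, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Lowering the threshold catches more fraud (higher recall) at the cost of flagging more legitimate messages (lower precision), which is the trade-off that the false positive analysis in step 8 is meant to inform.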

Detecting Scams Using Large Language Models
