Abstract: Cybersecurity breaches targeting electrical substations constitute a significant threat to the integrity of the power grid, necessitating comprehensive defense and mitigation strategies. Any anomaly in information and communication technology (ICT) should be detected for secure communications between devices in digital substations. This paper proposes large language models (LLMs), e.g., ChatGPT, for the cybersecurity of IEC 61850-based digital substation communications. Multicast messages such as generic object oriented substation event (GOOSE) and sampled value (SV) are used for case studies. The proposed LLM-based cybersecurity framework includes, for the first time, data pre-processing of communication systems and human-in-the-loop (HITL) training (considering the cybersecurity guidelines recommended by humans). The results present a comparative analysis of the detected anomalies, based on performance evaluation metrics for different LLMs. A hardware-in-the-loop (HIL) testbed is used to generate and extract a dataset of IEC 61850 communications.

What problem does this paper attempt to address?

This paper aims to address the cybersecurity threats faced by communication protocols (such as GOOSE and SV) in power systems. Specifically, the paper focuses on how to use large language models (LLMs), such as ChatGPT, to improve the cybersecurity protection capabilities of IEC 61850-based communication in digital substations.

### Main Problems

1. **Cybersecurity Threats**: Any abnormality in the information and communication technology (ICT) of power substations may pose a significant threat to the integrity of the power grid. Therefore, effective detection mechanisms are required to ensure secure communication between devices.
2. **Limitations of Existing Methods**: Traditional intrusion detection systems (IDS) mainly rely on machine learning (ML) methods. Although these methods are accurate, they need to be frequently retrained to deal with newly emerging attack patterns, which consumes a large amount of time and resources, and the system is vulnerable to attacks during the period when new threats are not included in the model knowledge base.
3. **Adaptability Requirements**: With the increasing diversity and complexity of cyberattacks, a more dynamic and adaptable method is needed to identify and respond to new types of threats.

### Solutions

The paper proposes a cybersecurity framework based on large language models (LLMs), which combines data pre-processing and human-in-the-loop (HITL) training methods. Specific features are as follows:

- **Data Pre-processing**: Pre-process the data of the communication system to improve the input quality of the model.
- **Human-in-the-loop (HITL)**: Consider cybersecurity guidelines and enhance the model training process through the advice of human experts.
- **Performance Evaluation**: Use a hardware-in-the-loop (HIL) test platform to generate and extract datasets of IEC 61850 communication, and conduct a comparative analysis of the performance of different LLMs.

### Main Contributions

1. **First Application of LLMs**: For the first time, it is proposed to use different LLMs (such as ChatGPT 4.0, Anthropic's Claude 2, and Google Bard/PaLM 2) to detect abnormalities in GOOSE and SV datasets.
2. **HITL as an IDS**: Apply the LLM-based HITL method to the anomaly detection of the IEC 61850 communication protocol.
3. **Algorithm Conversion**: Convert the IDS algorithm into text for training datasets to detect anomalies.

### Experimental Results

- **Performance Evaluation**: Different LLMs are compared through multiple performance evaluation metrics (such as true positive rate (TPR), false positive rate (FPR), false negative rate (FNR), precision, and F1-score).
- **Best Performance**: ChatGPT 4.0 outperforms the other two LLMs in all evaluation metrics. In particular, at the fully trained level, its TPR is 98.18% (GOOSE) and 96.67% (SV), and both FPR and FNR are less than 4%.

### Conclusion

The paper demonstrates the application potential of LLMs and HITL methods in the cybersecurity of digital substations. In particular, ChatGPT 4.0 performs strongly in detecting anomalies in GOOSE and SV messages. Future research will further explore the application of other LLMs, combined with task-oriented dialogue (ToD) and fine-tuning techniques, to improve the efficiency and accuracy of the model.
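The evaluation metrics named under Experimental Results (TPR, FPR, FNR, precision, and F1-score) all follow from the four confusion-matrix counts of a detector. The sketch below uses the standard textbook definitions; the `ConfusionCounts` type, the `detection_metrics` function, and the example counts are hypothetical illustrations and are not taken from the paper's dataset or code.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for one detector on one labeled dataset (hypothetical type)."""
    tp: int  # anomalous messages flagged as anomalous
    fp: int  # normal messages flagged as anomalous
    tn: int  # normal messages passed as normal
    fn: int  # anomalous messages passed as normal


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard anomaly-detection metrics from raw counts."""
    tpr = _ratio(c.tp, c.tp + c.fn)        # true positive rate (recall)
    fpr = _ratio(c.fp, c.fp + c.tn)        # false positive rate
    fnr = _ratio(c.fn, c.tp + c.fn)        # false negative rate = 1 - TPR
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * tpr, precision + tpr)
    return {"TPR": tpr, "FPR": fpr, "FNR": fnr,
            "precision": precision, "F1": f1}


if __name__ == "__main__":
    # Illustrative counts only; these are NOT results from the paper.
    example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
    for name, value in detection_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Because TPR and FNR share the same denominator, they always sum to 1 when any anomalous messages are present; a reported TPR of 96.67%, for example, corresponds to an FNR of 3.33%.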
