ChatGPT and Other Large Language Models for Cybersecurity of Smart Grid Applications

Aydin Zaboli, Seong Lok Choi, Tai-Jin Song, Junho Hong
2024-02-26
Abstract: Cybersecurity breaches targeting electrical substations constitute a significant threat to the integrity of the power grid, necessitating comprehensive defense and mitigation strategies. Any anomaly in information and communication technology (ICT) should be detected for secure communications between devices in digital substations. This paper proposes large language models (LLMs), e.g., ChatGPT, for the cybersecurity of IEC 61850-based digital substation communications. Multicast messages such as generic object oriented system event (GOOSE) and sampled value (SV) are used for case studies. The proposed LLM-based cybersecurity framework includes, for the first time, data pre-processing of communication systems and human-in-the-loop (HITL) training (considering the cybersecurity guidelines recommended by humans). The results show a comparative analysis of detected anomaly data carried out based on the performance evaluation metrics for different LLMs. A hardware-in-the-loop (HIL) testbed is used to generate and extract a dataset of IEC 61850 communications.
Cryptography and Security, Systems and Control
What problem does this paper attempt to address?
This paper aims to solve the cybersecurity threats faced by communication protocols (such as GOOSE and SV) in power systems. Specifically, the paper focuses on how to use large language models (LLMs), such as ChatGPT, to improve the cybersecurity protection capabilities of IEC 61850-based communication in digital substations.

### Main Problems

1. **Cybersecurity Threats**: Any abnormality in the information and communication technology (ICT) of power substations may pose a significant threat to the integrity of the power grid. Therefore, effective detection mechanisms are required to ensure secure communication between devices.
2. **Limitations of Existing Methods**: Traditional intrusion detection systems (IDS) mainly rely on machine learning (ML) methods. Although these methods are accurate, they need to be frequently retrained to deal with newly emerging attack patterns, which consumes a large amount of time and resources, and the system is vulnerable to attacks during the period when new threats are not included in the model knowledge base.
3. **Adaptability Requirements**: With the increasing diversity and complexity of cyberattacks, a more dynamic and adaptable method is needed to identify and respond to new types of threats.

### Solutions

The paper proposes a cybersecurity framework based on large language models (LLMs), which combines data pre-processing and human-in-the-loop (HITL) training methods. Specific features are as follows:

- **Data Pre-processing**: Pre-process the data of the communication system to improve the input quality of the model.
- **Human-in-the-loop (HITL)**: Consider cybersecurity guidelines and enhance the model training process through the advice of human experts.
- **Performance Evaluation**: Use a hardware-in-the-loop (HIL) test platform to generate and extract datasets of IEC 61850 communication, and conduct a comparative analysis of the performance of different LLMs.

### Main Contributions

1. **First Application of LLMs**: For the first time, it is proposed to use different LLMs (such as ChatGPT 4.0, Anthropic's Claude 2, and Google Bard/PaLM 2) to detect abnormalities in GOOSE and SV datasets.
2. **HITL as an IDS**: Apply the LLM-based HITL method to the anomaly detection of the IEC 61850 communication protocol.
3. **Algorithm Conversion**: Convert the IDS algorithm into text for training datasets to detect anomalies.

### Experimental Results

- **Performance Evaluation**: Different LLMs are compared through multiple performance evaluation metrics (such as true positive rate (TPR), false positive rate (FPR), false negative rate (FNR), precision, and F1-score).
- **Best Performance**: ChatGPT 4.0 outperforms the other two LLMs in all evaluation metrics. Especially at the fully trained level, its TPR is 98.18% (GOOSE) and 96.67% (SV) respectively, and both FPR and FNR are less than 4%, showing excellent performance.

### Conclusion

The paper demonstrates the application potential of LLMs and HITL methods in the cybersecurity of digital substations. In particular, ChatGPT 4.0 performs excellently in detecting anomalies in GOOSE and SV messages. Future research will further explore the application of other LLMs, combined with task-oriented dialogue (ToD) and fine-tuning techniques, to improve the efficiency and accuracy of the model.
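
The evaluation metrics named under Experimental Results (TPR, FPR, FNR, precision, and F1-score) all follow from the four confusion-matrix counts of an anomaly detector. The sketch below uses the standard textbook definitions and is not taken from the paper's code; the names `ConfusionCounts`, `count_outcomes`, and `detection_metrics` are hypothetical, and the labels in the usage note are made-up illustration data, not results from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts, with 'anomalous message' as the positive class."""
    tp: int  # anomalous messages flagged as anomalous
    fp: int  # normal messages flagged as anomalous
    tn: int  # normal messages passed as normal
    fn: int  # anomalous messages passed as normal


def count_outcomes(labels: Sequence[bool], predictions: Sequence[bool]) -> ConfusionCounts:
    """Tally outcomes; True means 'anomalous' in both sequences."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the five metrics as fractions in [0, 1]."""
    tpr = _ratio(c.tp, c.tp + c.fn)        # true positive rate (recall)
    fpr = _ratio(c.fp, c.fp + c.tn)        # false positive rate
    fnr = _ratio(c.fn, c.tp + c.fn)        # false negative rate, equals 1 - TPR
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * tpr, precision + tpr)
    return {"tpr": tpr, "fpr": fpr, "fnr": fnr, "precision": precision, "f1": f1}
```

For example, `detection_metrics(count_outcomes(labels, predictions))` scores one model's verdicts against ground-truth labels. Because TPR and FNR share the denominator TP + FN, they always sum to 1 whenever at least one anomalous message is present, which is a quick sanity check on any reported pair of values.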