Efficient Federated Intrusion Detection in 5G ecosystem using optimized BERT-based model

Frederic Adjewa, Moez Esseghir, Leila Merghem-Boulahia
2024-09-28
Abstract: The fifth-generation (5G) offers advanced services, supporting applications such as intelligent transportation, connected healthcare, and smart cities within the Internet of Things (IoT). However, these advancements introduce significant security challenges, with increasingly sophisticated cyber-attacks. This paper proposes a robust intrusion detection system (IDS) using federated learning and large language models (LLMs). The core of our IDS is based on BERT, a transformer model adapted to identify malicious network flows. We modified this transformer to optimize performance on edge devices with limited resources. Experiments were conducted in both centralized and federated learning contexts. In the centralized setup, the model achieved an inference accuracy of 97.79%. In a federated learning context, the model was trained across multiple devices using both IID (Independent and Identically Distributed) and non-IID data, based on various scenarios, ensuring data privacy and compliance with regulations. We also leveraged linear quantization to compress the model for deployment on edge devices. This reduction resulted in a slight decrease of 0.02% in accuracy for a model size reduction of 28.74%. The results underscore the viability of LLMs for deployment in IoT ecosystems, highlighting their ability to operate on devices with constrained computational and storage resources.
Cryptography and Security, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the growing cybersecurity challenges in the 5G ecosystem, especially the security threats facing Internet of Things (IoT) applications such as intelligent transportation, connected healthcare, and smart cities. Specifically, it aims to develop a robust intrusion detection system (IDS) that copes with complex, evolving attacks on 5G networks and can be deployed efficiently on resource-constrained edge devices.

### Overview of Main Problems

1. **Security Challenges in 5G Networks**:
   - 5G networks support a wider range of applications and services, such as intelligent transportation, connected healthcare, and smart cities, which introduce new security risks.
   - Network attacks are becoming increasingly complex and targeted, and traditional firewalls and similar security measures struggle to handle them effectively.
2. **Limitations of Existing Intrusion Detection Systems**:
   - Existing intrusion detection systems rely mainly on signature-based detection (SIDS) or anomaly detection. The former depends on predefined rules, and the latter on a learned model of normal traffic.
   - These methods perform poorly against unknown attacks, especially in an environment as heterogeneous and dense as 5G.
3. **Requirement for Privacy Protection**:
   - In the 5G environment, the use of personal data increases the risk of privacy leakage, so the model must be trained in a way that protects data privacy.

### Solutions

To solve these problems, the paper proposes an intrusion detection system based on federated learning (FL) and large language models (LLMs). The core of the system is an optimized BERT model, adapted to the resource limits of edge devices. The specific contributions are as follows:

1. **Efficient Federated Intrusion Detection System**:
   - Federated learning trains the model collaboratively across multiple devices while the data stays local, which protects user privacy.
   - The model performs well in both centralized and federated settings. It reaches 97.79% accuracy in the centralized setting and also achieves high accuracy in the federated setting, with both IID and non-IID data.
2. **Optimized BERT Model**:
   - Reducing the number of model layers and applying linear quantization significantly shrinks the model, making it suitable for deployment on resource-constrained edge devices.
   - Reducing the layers cuts the model size by 89.85%. Quantization then removes a further 28.74% of the remaining size (the figure given in the abstract), for a cumulative reduction of 92.76%, at a cost of only 0.02% in accuracy.
3. **Handling Non-Independent and Identically Distributed (non-IID) Data**:
   - The paper studies model convergence under different data distributions (IID vs. non-IID) and finds that more participating clients and more local training improve model performance, with accuracy close to 97% even on non-IID data.

### Summary

By combining federated learning with an optimized BERT model, the paper provides an intrusion detection solution that is both efficient and privacy-preserving, and well suited to the complex environment of the 5G ecosystem. The research shows the potential of LLMs in cybersecurity and underlines their practical value on resource-constrained devices.
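
The federated training described under Solutions can be illustrated with a minimal sketch of one server-side aggregation round. It assumes FedAvg-style weighted averaging, which this summary does not name, so treat the choice of algorithm as an assumption. The function `fedavg`, the toy tensors, and the client sample counts are all hypothetical.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Weighted average of per-client parameter lists (FedAvg-style; assumed here)."""
    total = float(sum(client_sizes))
    n_tensors = len(client_params[0])
    aggregated = []
    for t in range(n_tensors):
        # Weight each client's tensor by its share of the total training samples.
        tensor_avg = sum(
            (size / total) * params[t]
            for params, size in zip(client_params, client_sizes)
        )
        aggregated.append(tensor_avg)
    return aggregated

# Hypothetical example: three clients, each holding a two-tensor "model".
clients = [
    [np.full((2, 2), 1.0), np.full(2, 10.0)],
    [np.full((2, 2), 2.0), np.full(2, 20.0)],
    [np.full((2, 2), 4.0), np.full(2, 40.0)],
]
sizes = [100, 100, 200]  # local sample counts; raw data never leaves the clients
global_params = fedavg(clients, sizes)
```

Only parameters and sample counts reach the server in this sketch, which is the property the summary relies on for privacy. A real deployment would repeat the round many times and send `global_params` back to the clients for further local training.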
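
To make the IID vs. non-IID distinction concrete, the sketch below splits a labelled dataset across clients in two ways. The paper's actual partitioning scenarios are not described in this summary. The Dirichlet label-skew scheme, the function names, and the `alpha` parameter are assumptions chosen for illustration.

```python
import numpy as np

def partition_iid(labels, n_clients, seed=0):
    """Shuffle all sample indices and deal them out in near-equal chunks."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(len(labels)), n_clients)

def partition_label_skew(labels, n_clients, alpha=0.5, seed=0):
    """Dirichlet label-skew split (assumed scheme): smaller alpha, more skew."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.flatnonzero(labels == cls))
        # Draw this class's share per client, then cut the index list accordingly.
        shares = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(shares)[:-1] * len(cls_idx)).astype(int)
        for client, part in enumerate(np.split(cls_idx, cuts)):
            client_idx[client].extend(part.tolist())
    return [np.array(ix, dtype=int) for ix in client_idx]

# Hypothetical data: 1000 flows with 5 traffic classes, split over 4 clients.
labels = np.random.default_rng(42).integers(0, 5, size=1000)
iid_parts = partition_iid(labels, n_clients=4)
skewed_parts = partition_label_skew(labels, n_clients=4, alpha=0.5)
```

In the IID split every client sees roughly the same class mix. In the skewed split some clients may hold almost none of a given attack class, which is the kind of setting where the summary reports that more clients and more local training help convergence.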
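
The linear quantization step mentioned above can be sketched for a single weight tensor as follows. This is a generic affine mapping from float32 to int8, not the authors' implementation. The helper names and the random weight matrix are hypothetical.

```python
import numpy as np

def linear_quantize(w, num_bits=8):
    """Map a float tensor onto signed integers with one scale and zero point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    w_min, w_max = float(w.min()), float(w.max())
    # Fall back to 1.0 so a constant tensor does not produce a zero scale.
    scale = (w_max - w_min) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float tensor from the integer representation."""
    return (q.astype(np.float32) - zero_point) * scale

# Hypothetical weight matrix standing in for one layer of the model.
w = np.random.default_rng(0).normal(size=(64, 64)).astype(np.float32)
q, scale, zero_point = linear_quantize(w)
w_hat = dequantize(q, scale, zero_point)
max_error = float(np.max(np.abs(w_hat - w)))  # bounded by roughly one step
```

Here `q` occupies a quarter of the bytes of `w`, because int8 replaces float32. That per-tensor saving is not comparable to the 28.74% whole-model reduction reported in the abstract, which depends on how the full model file is stored and quantized.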