A Survey on the Applications of Semi-Supervised Learning to Cyber-Security

Paul K. Mvula, Paula Branco, Guy-Vincent Jourdan, Herna L. Viktor
DOI: https://doi.org/10.1145/3657647
IF: 16.6
2024-04-11
ACM Computing Surveys
Abstract: Machine Learning’s widespread application owes to its ability to develop accurate and scalable models. In cyber-security, where labeled data is scarce, Semi-Supervised Learning (SSL) emerges as a potential solution. SSL excels at tasks challenging traditional supervised and unsupervised algorithms by leveraging limited labeled data alongside abundant unlabeled data. This paper presents a comprehensive survey of SSL in cyber-security, focusing on countering diverse cybercrimes, particularly intrusion detection. Despite its potential, a notable research gap persists, with few recent studies comprehensively reviewing SSL’s application in cyber-security. This study examines state-of-the-art SSL techniques tailored for cyber-security to address this gap. Relevant methods are identified, and their effectiveness is evaluated to empower researchers and practitioners with insights to enhance cyber-security measures. This work sheds light on SSL’s potential in addressing data scarcity in cyber-security domains in addition to outlining new research directions to advance this crucial field. By bridging this research gap, this manuscript paves the way for enhanced cyber-threat detection and mitigation in an increasingly interconnected world.
computer science, theory & methods
### What problems does this paper attempt to solve?

This paper addresses the difficulty of training machine-learning models in cyber-security, where labeled data is scarce. Specifically:

1. **Scarcity of labeled data**: Labeled cyber-security data is costly and difficult to obtain, so very little is available for training supervised models. This limits how well traditional supervised and unsupervised learning algorithms perform.
2. **Application of semi-supervised learning**: To address this, the paper explores the use of semi-supervised learning (SSL) in cyber-security. SSL improves model performance by combining a small amount of labeled data with a large amount of unlabeled data, compensating for the shortage of labels.
3. **Filling the research gap**: Although SSL has been studied in other fields, few recent studies comprehensively review its application to cyber-security. The paper aims to fill this gap by systematically summarizing and evaluating the current state of SSL in cyber-security.
4. **Specific application scenarios**: The paper focuses on SSL in several cyber-security tasks, including intrusion detection, spam and phishing detection, and malware detection. These tasks are central to countering increasingly complex threats.
5. **Future research directions**: The paper also identifies the challenges facing current SSL methods in cyber-security applications and proposes future research directions to advance the field.
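
The idea of combining a small labeled set with a large unlabeled pool can be illustrated with self-training (pseudo-labeling), one common SSL technique. The sketch below is not taken from the paper: the nearest-centroid classifier, the `min_margin` confidence rule, the two toy features, and all function names are assumptions chosen only to keep the example short and dependency-free.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def fit_centroids(labeled):
    """Compute one centroid per class from (features, label) pairs."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_class.items()}

def predict(centroids, x):
    """Return (label, margin): nearest class and the distance gap to the runner-up."""
    ranked = sorted((math.dist(x, c), y) for y, c in centroids.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin

def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Iteratively pseudo-label confident unlabeled samples and refit.

    A sample counts as confident when its margin is at least min_margin
    (a hypothetical rule chosen for this sketch).
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            label, margin = predict(centroids, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)  # pseudo-labels join the training set
        pool = remaining
    return fit_centroids(labeled), pool

# Toy data with two made-up features per connection record.
labeled = [([1.0, 1.0], "benign"), ([9.0, 9.0], "attack")]
unlabeled = [[1.5, 0.5], [8.0, 9.5], [2.0, 2.0], [8.5, 8.0], [5.0, 5.0]]

model, leftover = self_train(labeled, unlabeled)
label, _ = predict(model, [0.5, 1.5])
```

In this toy run, four of the five unlabeled records are pseudo-labeled in the first round, while the ambiguous record at `[5.0, 5.0]` stays in `leftover` because it sits equally far from both class centroids. Declining to label low-confidence samples is one way to limit the error propagation that self-training is prone to.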

### Main contributions of the paper

- **Provide insights into major cyber-security threats and trends**: The paper analyzes in detail the efforts and trends in using machine learning, especially SSL methods, to counter cyber-security threats.
- **Propose a broad classification of SSL concepts and methods**: The paper builds a detailed classification framework for SSL methods, which helps in analyzing and understanding the various SSL solutions.
- **Comprehensively review the application of SSL in cyber-security**: The paper reviews the application of SSL to intrusion detection, spam detection, phishing detection, and malware detection, among other tasks.
- **Identify open challenges and future research directions**: The paper discusses the open challenges in existing solutions and offers suggestions for future research.

Through these contributions, the paper gives researchers and practitioners a reference for understanding and applying SSL to strengthen cyber-security measures.