Abstract: Attacks on network systems are becoming more common and attack methods more sophisticated. Intrusion prevention technology has emerged as a natural consequence of the development of computer and network technology, and in recent years it has become a focus of network security research. To protect confidential information on computer networks, the authors propose a semisupervised clustering intrusion detection algorithm. The paper first gives an overview of machine learning and explains the theory of cluster analysis. Simulation experiments then apply both the K-means algorithm and the proposed semisupervised clustering algorithm to intrusion detection on 10,000 records. Different values of K were tried, and three datasets were drawn from "kddcup.newtestdata_10_percent_corrected"; each dataset was tested separately, and the average over the three was taken as the test result. In the simulation results, the semisupervised clustering algorithm achieves a higher detection rate than the K-means clustering algorithm, and its false alarm rate is also improved relative to K-means. The authors therefore conclude that the semisupervised algorithm makes the system more stable and improves on the performance of K-means to a certain extent. The false alarm rate rises as K increases, while the detection rate peaks at K = 20, where it reaches 91.76% with a false alarm rate of 8.54%.
The detection rate of the proposed algorithm is significantly higher than that of the other two algorithms. Its false positive rate is slightly higher than that of K-means but lower than that of the other algorithm, which supports the performance advantage of the proposed algorithm.
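
As a rough illustration of the comparison described in the abstract, the following sketch runs a seeded variant of K-means on a small hand-made dataset and then computes a detection rate and a false alarm rate. It is not the authors' algorithm: seeded initialization is one common semisupervised clustering technique, swapped in here purely for illustration. The toy points, the labels "normal" and "attack", and all function names are hypothetical. The two rates use the conventional definitions (detected attacks over all attacks, and normal records flagged as attacks over all normal records), which the abstract does not spell out.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def seeded_kmeans(points, seeds, iterations=20):
    """K-means whose initial centroids are the means of labeled seed groups.

    seeds maps a label to a list of labeled points (assumed to be available).
    In this sketch the seeds are used only for initialization.
    Returns (labels, centroids); labels[i] names the cluster of centroids[i].
    """
    labels = list(seeds)
    centroids = [mean(seeds[label]) for label in labels]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a cluster received no points.
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def rates(predicted, actual):
    """Return (detection rate, false alarm rate) as fractions."""
    attacks = sum(1 for a in actual if a == "attack")
    normals = len(actual) - attacks
    detected = sum(1 for p, a in zip(predicted, actual)
                   if a == "attack" and p == "attack")
    false_alarms = sum(1 for p, a in zip(predicted, actual)
                       if a == "normal" and p == "attack")
    return detected / attacks, false_alarms / normals


# Hypothetical labeled seeds: a few records whose class is known in advance.
seeds = {
    "normal": [(1.0, 1.0), (1.2, 0.8)],
    "attack": [(8.0, 8.0), (7.5, 8.5)],
}

# Hypothetical unlabeled records with ground truth kept aside for scoring.
records = [
    ((1.0, 1.2), "normal"), ((0.8, 0.9), "normal"),
    ((1.5, 1.1), "normal"), ((6.0, 6.5), "normal"),  # outlying normal record
    ((8.0, 7.9), "attack"), ((7.6, 8.3), "attack"),
    ((8.4, 8.1), "attack"), ((2.0, 2.0), "attack"),  # attack close to normal
]
points = [p for p, _ in records]
actual = [label for _, label in records]

labels, centroids = seeded_kmeans(points, seeds)
predicted = [labels[nearest(p, centroids)] for p in points]
detection_rate, false_alarm_rate = rates(predicted, actual)

print(f"detection rate:   {detection_rate:.2%}")
print(f"false alarm rate: {false_alarm_rate:.2%}")
```

On this toy data one attack that lies close to the normal cluster is missed and one outlying normal record is flagged, so the sketch prints a detection rate of 75.00% and a false alarm rate of 25.00%. These numbers are unrelated to the results reported in the abstract.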

Analysis of Domain Name Queries Based on the K-Means Algorithm

Visualizing and Characterizing DNS Lookup Behaviors via Log-Mining

Can We Learn What People Are Doing from Raw DNS Queries?

Inconsistency Between Domain Name and Server Location: Phenomena, Causes, and Countermeasures

DNS Request Log Analysis of Universities in Shanghai: A CDN Service Provider's Perspective

Computer Image Content Retrieval Considering K-Means Clustering Algorithm

Internet Traffic Analysis in a Large University Town: A Graphical and Clustering Approach

Abnormal Network Traffic Analysis Based on IP Address Clustering

Characterizing Network Anomaly Traffic with Euclidean Distance-Based Multiscale Fuzzy Entropy

A Hybrid Packet Clustering Approach for NAT Host Analysis

Clustering Approach and Characteristic Indices for Load Profiles of Customers Using Data from AMI

User Online Detection Algorithm Based on DNS Log Analysis

A Nameserver Importance Ranking Method Based on Heterogeneous Information Network

Vector Space Embedding of DNS Query Behaviors by Deep Learning

Identifying IP Usage Scenarios: Problems, Data, and Benchmarks

Computing Geographical Serving Area Based On Search Logs And Website Categorization

Measuring and classifying IP usage scenarios: a continuous neural trees approach

Computer Network Confidential Information Security Based on Big Data Clustering Algorithm

Mining the Web for IP Address Geolocations

Measurement and Analysis on DNS Configuration Errors in .CN Domain

Detecting Geographical Serving Area of Web Resources