Abstract: Malicious domains provide malware with covert communication channels, which poses a severe threat to cybersecurity. Despite continuous progress in detecting malicious domains with various machine learning algorithms, it is difficult to maintain an up‐to‐date, varied set of finely labeled training samples, especially as malicious domain variants increase. To handle these issues and improve detection accuracy, we propose a novel malicious domain detection method named MDND‐SS‐PO that combines semi‐supervised learning and parameter optimization. The contributions of the study are as follows. First, the method extracts statistical features of the IP address, the TTL value, the NXDomain record, and the domain name query characteristics to discriminate Domain‐Flux and Fast‐Flux domain names simultaneously. Second, an improved DBSCAN based on neighborhood division is designed to cluster labeled and unlabeled data with low time consumption. Then, based on the clustering hypothesis, unlabeled data are tagged with pseudo‐labels according to the clustering results, so that a supervised classifier can be trained effectively with less labeling effort. Finally, Gaussian process regression is used to optimize the parameter settings of the machine learning algorithms, and the Silhouette index and F1 score are introduced to evaluate the optimization results. Experimental results show that the proposed method achieves a precise detection performance of 0.885 when the ratio of labeled data is 5%.
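
The cluster-then-label step described in the abstract can be illustrated with a minimal sketch. It uses plain, brute-force DBSCAN rather than the improved neighborhood-division variant, which the abstract does not specify, and it assigns each unlabeled sample the majority label of its cluster. The two-dimensional feature vectors, the values of `eps` and `min_pts`, and the function names `dbscan` and `pseudo_label` are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter, deque
from math import dist

NOISE = -1  # cluster id given to points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Plain brute-force DBSCAN (NOT the paper's improved variant)."""
    n = len(points)
    # Neighbor lists; each point counts as its own neighbor.
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    cluster_of = [None] * n  # None means "not visited yet"
    next_id = 0
    for i in range(n):
        if cluster_of[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            cluster_of[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_of[i] = next_id
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if cluster_of[j] == NOISE:
                cluster_of[j] = next_id  # border point: joins, never expands
            if cluster_of[j] is not None:
                continue
            cluster_of[j] = next_id
            if len(neighbors[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbors[j])
        next_id += 1
    return cluster_of


def pseudo_label(cluster_of, labels):
    """Give unlabeled samples (None) the majority label of their cluster."""
    votes = {}
    for cid, lab in zip(cluster_of, labels):
        if cid != NOISE and lab is not None:
            votes.setdefault(cid, Counter())[lab] += 1
    majority = {cid: c.most_common(1)[0][0] for cid, c in votes.items()}
    # Noise points and clusters without any labeled member stay None.
    return [lab if lab is not None else majority.get(cid)
            for cid, lab in zip(cluster_of, labels)]


# Hypothetical, already-normalized feature vectors (illustrative values only).
benign = [(0.10, 0.10), (0.12, 0.11), (0.09, 0.13), (0.11, 0.08)]
malicious = [(0.90, 0.85), (0.88, 0.90), (0.92, 0.87), (0.91, 0.91)]
points = benign + malicious + [(0.50, 0.50)]
# Only two samples carry a ground-truth label (0 = benign, 1 = malicious).
labels = [0, None, None, None, 1, None, None, None, None]

clusters = dbscan(points, eps=0.1, min_pts=3)
pseudo = pseudo_label(clusters, labels)
print(clusters)
print(pseudo)
```

The pseudo-labeled samples could then be passed to any supervised classifier. Samples that end up as noise, or in a cluster with no labeled member, keep the value `None` and would be left out of training in this sketch.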

Malicious domain detection based on semi‐supervised learning and parameter optimization
