Abstract: DoS and DDoS attacks have grown in size and number over the last decade, and existing solutions to mitigate them are largely inefficient. Compared to other types of malicious cyber attacks, DoS and DDoS attacks are particularly challenging to combat. Because they can mask themselves as legitimate traffic, it has proven difficult to develop methods that detect them at the packet or flow level. In this paper, we explore the potential of Variational Autoencoders to serve as a component within an intelligent security solution that differentiates between normal and malicious traffic. The motivation for using Variational Autoencoders is that, unlike standard autoencoders, which encode an input flow as a single point, they encode a flow as a distribution over the latent space, which helps avoid overfitting. Intuitively, this allows a Variational Autoencoder not only to learn latent representations of seen input features, but also to generalize to unseen flows and to flow features with slight variations. We propose two methods based on the ability of Variational Autoencoders to learn latent representations from network traffic flows of both benign and malicious traffic. The first method trains a classifier on the latent encodings obtained from Variational Autoencoders learned from traffic traces. The second method is an anomaly detection method in which the Variational Autoencoder learns abstract feature representations of exclusively legitimate traffic. Anomalies are then filtered out by relying on the reconstruction loss of the Variational Autoencoder: the reconstruction loss is fed as input to a classifier that outputs the class of the traffic, benign or malicious, and eventually the attack type.
Thus, the second approach uses two separate training processes on two separate data sources: the first involves only legitimate traffic, and the second involves all traffic classes. This differs from the first approach, which uses a single training process on the whole traffic dataset. The autoencoder of the first approach therefore aims to learn a general feature representation of the flows, while the autoencoder of the second approach aims to learn a representation of the benign traffic exclusively. The second approach is thus better suited to detecting zero-day attacks and discovering new attacks as anomalies. Both proposed methods have been thoroughly tested on two separate datasets with a similar feature space. The results show that both methods are promising, with the classifier-based method being slightly superior to the anomaly-based one.
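
The two quantities the abstract relies on, a flow encoded as a distribution over the latent space and a reconstruction loss used to separate benign from anomalous traffic, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the diagonal-Gaussian encoder output (`mu`, `log_var`), the squared-error reconstruction loss, and the quantile threshold that stands in here for the second-stage classifier are all assumptions made for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z ~ N(mu, diag(exp(log_var))) as z = mu + sigma * eps.

    mu and log_var are the (assumed) per-dimension outputs of a VAE
    encoder for one flow; a plain autoencoder would output only a point.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, I), diagonal case.

    This is the regularization term of the usual VAE objective.
    """
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def reconstruction_error(x, x_hat):
    """Mean squared error between flow features and their reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(benign_errors, quantile=0.95):
    """Choose a cut-off from errors measured on benign flows only.

    Assumption: a simple quantile rule replaces the classifier that the
    abstract feeds the reconstruction loss into.
    """
    ordered = sorted(benign_errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def is_anomalous(error, threshold):
    """Flag a flow whose reconstruction error exceeds the benign cut-off."""
    return error > threshold


# Hypothetical values, for illustration only.
rng = random.Random(0)
z = reparameterize(mu=[0.0, 1.0], log_var=[0.0, -2.0], rng=rng)
threshold = fit_threshold([0.02, 0.05, 0.03, 0.04, 0.06], quantile=0.8)
flagged = is_anomalous(reconstruction_error([1.0, 2.0], [1.0, 4.0]), threshold)
```

In the first method, the latent encoding (for example `mu`) would be passed to a classifier trained on all traffic classes. In the second, only the error of an autoencoder trained on benign flows is used, which is why traffic unlike anything seen in training tends to stand out.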

DoS and DDoS mitigation using Variational Autoencoders
